While I understood how Bitcoin mining worked, I still had a few loose ends in my mind about why exactly the 'Proof-of-Work' algorithm was designed that way (i.e. finding a nonce that results in a hash lower than the target value).

"A* nonce is an abbreviation for "***number only used once**," which, in the context of cryptocurrency mining, is a number added to a hashed—or encrypted—block in a blockchain that, when rehashed, meets the difficulty level restrictions. **The nonce is the number that blockchain miners are solving for**. *When the solution is found, the blockchain miners are offered cryptocurrency in exchange.*

**A target hash is a numeric value that a hashed block header (which is used to identify individual blocks in a blockchain) must be less than or equal to in order for a new block to be awarded to a miner.**

*The Bitcoin network adjusts the difficulty of mining by raising or lowering the target hash in order to preserve an average 10-minute interval between new blocks.*

*The block header contains the block version number, a timestamp, the hash used in the previous block, the hash of the Merkle Root, the nonce, and the target hash. The block is generated by taking the hash of the block contents, adding a random string of numbers (the nonce), and hashing the block again.*

*Determining which string to use as the nonce requires a significant amount of trial-and-error, as it is a random string. A miner must guess a nonce, append it to the hash of the current header, rehash the value, and compare this to the target hash. If the resulting hash value meets the requirements (golden nonce), the miner has created a solution and is awarded the block.*

*It is highly unlikely that a miner will successfully guess the nonce on the first try, meaning that the miner may potentially test a large number of nonce options before getting it right. **The greater the difficulty, a measure of how hard it is to create a hash that is less than the target, the longer it is likely to take to generate a solution.**"*
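To make the guess-and-check loop from the quote concrete, here is a minimal Python sketch of it. It is a toy, not real mining code: the names `double_sha256`, `mine`, and `TOY_TARGET` are my own, the "header" is just an arbitrary byte string, and the target is far easier than anything the network would accept. What it does borrow from Bitcoin is the double SHA-256 hash, a 4-byte little-endian nonce at the end of the header, and reading the digest as a little-endian 256-bit integer before comparing it to the target.

```python
import hashlib
import struct


def double_sha256(data: bytes) -> bytes:
    """SHA-256 applied twice, as Bitcoin does for block headers."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def mine(header_prefix: bytes, target: int, max_nonce: int = 2**32):
    """Try nonces 0, 1, 2, ... until the hash is <= target.

    header_prefix stands in for the other header fields (hypothetical
    placeholder, not a real 76-byte Bitcoin header prefix).
    Returns (nonce, digest) on success, or None if the 32-bit nonce
    space is exhausted.
    """
    for nonce in range(max_nonce):
        # Append the nonce as a 4-byte little-endian unsigned integer.
        candidate = header_prefix + struct.pack("<I", nonce)
        digest = double_sha256(candidate)
        # Interpret the 32-byte digest as a 256-bit little-endian number.
        if int.from_bytes(digest, "little") <= target:
            return nonce, digest
    return None


# Toy target (assumption): roughly 1 in 2**16 hashes will satisfy it.
TOY_TARGET = 2**240

if __name__ == "__main__":
    found = mine(b"toy block header", TOY_TARGET)
    if found is not None:
        nonce, digest = found
        print("nonce:", nonce)
        # Shown in reversed byte order, the way block hashes are usually printed.
        print("hash: ", digest[::-1].hex())
```

Lowering `TOY_TARGET` (say to `2**232`) makes the loop take noticeably longer, which is exactly the difficulty knob the quote describes.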

Ultimately, it is all about guessing a nonce, calculating the hash, and comparing it with the target. **Hence the capacity of a bitcoin mining farm is measured in terms of hash rate**, i.e. the number of hashes that can be computed per second. The term 'Bitcoin mining' is actually misleading, as what the miners are really doing is finding a hash that satisfies the challenge; this also validates the transactions in the block (and the block gets added to the chain).
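The link between hash rate and the time needed to find a block can be sketched with a little arithmetic. Assuming each hash is equally likely to be any 256-bit value, a single attempt succeeds with probability `(target + 1) / 2**256`, so on average it takes `2**256 / (target + 1)` attempts. The function names and the example hash rate below are my own illustrative choices, not real network figures.

```python
def expected_hashes(target: int) -> float:
    """Average number of hashes needed to find one that is <= target,
    assuming hash outputs are uniformly distributed over 256 bits."""
    return 2**256 / (target + 1)


def expected_seconds(target: int, hash_rate: float) -> float:
    """Average time to a solution for a miner doing hash_rate hashes/second."""
    return expected_hashes(target) / hash_rate


if __name__ == "__main__":
    toy_target = 2**240 - 1          # toy value: 1 in 2**16 hashes succeeds
    toy_hash_rate = 1_000_000.0      # made-up rate: one million hashes/second
    print(expected_hashes(toy_target))
    print(expected_seconds(toy_target, toy_hash_rate))
```

Doubling the hash rate halves the expected time, and halving the target doubles it, which is why hash rate is the natural unit for comparing mining capacity.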

But as more miners jump on the bandwagon, a lot of compute cycles are wasted, and this is a controversial topic for many people.