The reason A64s are so good for games and P4s are so good for media editing/encoding and similar tasks comes down to pipelining. There are four basic steps to processing a CPU instruction. First, the instruction needs to be fetched from memory or the CPU cache. Second, since the instruction is just a binary value, it needs to be decoded so that the CPU knows what it is and what to do with it. The third stage is the execution of the instruction, and the fourth and final stage is storing the result in the correct location (a register or memory) so that the program can continue. In older processors, one instruction would be handled at a time: it would go through each stage in turn, and the result would be stored before the next instruction could be fetched.
Pipelining is found in modern processors and means that as soon as one instruction has reached the second stage (decoding), the next one is fetched. The instructions then move along in line: as soon as the first one reaches the third stage, the second moves to the second stage and a new instruction is fetched, and so on. Once the pipeline is full, every stage is busy with a different instruction at the same time.
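To put rough numbers on that, here is a small Python sketch of the four-stage model described above. It assumes every stage takes exactly one cycle and nothing ever stalls, which real CPUs don't guarantee, and the function names and the F/D/E/S stage labels are just ones I made up for illustration.

```python
def sequential_cycles(n_instructions, n_stages=4):
    """Old-style CPU: each instruction passes through every stage before the next is fetched."""
    return n_instructions * n_stages


def pipelined_cycles(n_instructions, n_stages=4):
    """Idealised pipeline: fill it once, then one instruction finishes per cycle."""
    if n_instructions == 0:
        return 0
    return n_stages + n_instructions - 1


def pipeline_timeline(n_instructions, stages=("F", "D", "E", "S")):
    """Return one text row per cycle showing which instruction sits in each stage.

    F = fetch, D = decode, E = execute, S = store (hypothetical labels).
    """
    rows = []
    for cycle in range(pipelined_cycles(n_instructions, len(stages))):
        cells = []
        for stage_index, name in enumerate(stages):
            # The instruction in stage k was fetched k cycles ago.
            instr = cycle - stage_index
            cells.append(f"{name}:i{instr}" if 0 <= instr < n_instructions else f"{name}:--")
        rows.append(" ".join(cells))
    return rows


if __name__ == "__main__":
    print("sequential:", sequential_cycles(10), "cycles")
    print("pipelined: ", pipelined_cycles(10), "cycles")
    for row in pipeline_timeline(3):
        print(row)
```

Under these assumptions, 10 instructions take 40 cycles one at a time but only 13 cycles pipelined, and the timeline rows show each instruction sliding one stage to the right per cycle.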
There is a problem with pipelining though, and that is program branches. Most programs are not linear sets of instructions, so the outcome of one instruction may require the program to jump to another place in memory and continue from there. The problem is that the pipeline will already be full of the instructions that followed the branch in a straight line, so it has to be flushed and refilled with the new set of instructions. I think you can see where this is going now. The deeper the pipeline, the longer it takes for the new instructions to filter through and start producing results.
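The flush can be sketched with a toy cycle-by-cycle simulation in Python. This is only an illustration under some big assumptions: the CPU does no branch prediction at all, a jump is only resolved when it reaches the last stage, and each stage takes one cycle. The `run` function and the program encoding (`None` for an ordinary instruction, an integer for "jump to that index") are my own invention, and only forward jumps are used so the program always terminates.

```python
def run(program, depth):
    """Simulate a pipeline of the given depth.

    program: list where None is an ordinary instruction and an int is an
             unconditional jump to that index (hypothetical encoding).
    Returns (total_cycles, discarded_instructions).
    """
    pipeline = [None] * depth  # pipeline[0] is fetch, pipeline[-1] is the last stage
    pc = 0                     # index of the next instruction to fetch
    cycles = 0
    discarded = 0
    while pc < len(program) or any(slot is not None for slot in pipeline):
        cycles += 1
        # Fetch the next instruction in straight-line order, if any remain.
        fetched = pc if pc < len(program) else None
        if fetched is not None:
            pc += 1
        # Everything already in the pipeline moves along one stage.
        pipeline = [fetched] + pipeline[:-1]
        done = pipeline[-1]
        if done is not None and program[done] is not None:
            # A jump has just been resolved: everything fetched after it is wrong.
            discarded += sum(slot is not None for slot in pipeline[:-1])
            pipeline = [None] * (depth - 1) + [done]
            pc = program[done]
        # The instruction in the last stage finishes and leaves the pipeline.
        pipeline[-1] = None
    return cycles, discarded


if __name__ == "__main__":
    branchy = [None] * 20
    branchy[5] = 10  # instruction 5 jumps forward to instruction 10
    for depth in (4, 20):
        print("depth", depth, "->", run(branchy, depth))
```

With that one jump in a 20-entry program, this model takes 22 cycles on a 4-stage pipeline and 54 cycles on a 20-stage one, even though both execute exactly the same 16 instructions. The extra cycles are the refill after the flush.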
Pentium 4s have deep pipelines, so the penalty for not predicting a branch correctly is far more severe than on a CPU such as the Athlon 64, which has a much shallower pipeline. The payoff for a deep pipeline is that each stage does less work, which lets the chip run at a higher clock speed. This is why P4s excel at media encoding: it is mostly a long run of linear instructions, all set out in advance, and the program isn't going to jump from encoding one part of a song to converting an entirely different part. The majority of software you come across will be more.. um.. "branchy", so a chip with a shallower pipeline such as the Athlon 64 will benefit.
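Here is a back-of-the-envelope version of that trade-off in Python. It is a toy model, not a benchmark: it assumes a mispredicted branch costs a full refill of `depth - 1` cycles, that the deeper chip gets a higher clock in exchange for its depth, and it ignores everything else that matters in practice (caches, how many instructions each chip can issue per cycle, and so on). The depths and clock speeds are made-up round numbers for a "deep" and a "shallow" chip, not the specs of any real P4 or Athlon 64.

```python
def ns_per_instruction(depth, clock_ghz, mispredict_fraction):
    """Average time per instruction in nanoseconds under the toy model.

    mispredict_fraction: share of instructions that are mispredicted branches.
    """
    cycles_per_instruction = 1 + mispredict_fraction * (depth - 1)
    return cycles_per_instruction / clock_ghz


# Hypothetical chips: (pipeline depth, clock in GHz). Illustrative values only.
DEEP = (30, 3.0)
SHALLOW = (12, 2.4)

if __name__ == "__main__":
    for label, fraction in (("linear (encoding-like)", 0.0), ("branchy", 0.05)):
        deep = ns_per_instruction(*DEEP, fraction)
        shallow = ns_per_instruction(*SHALLOW, fraction)
        winner = "deep" if deep < shallow else "shallow"
        print(f"{label}: deep={deep:.3f} ns, shallow={shallow:.3f} ns -> {winner} wins")
```

With these made-up numbers the deep chip comes out ahead when there are no mispredictions, and the shallow chip comes out ahead once 5% of instructions are mispredicted branches. Where exactly the crossover sits depends entirely on the values you plug in.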
Hope it helps.