Paul here. I'm the author of that memtest program, and also the creator of Teensy. While luni64 already answered very well about your test program, hopefully I can add some clarity about the official memtest program.
Regarding "I thought that PSRAM could only be accessed via EXTMEM", indeed use of EXTMEM arrays or variables is the normal way. But it is not the only way. The memtest program uses pointers with the known address range of those memory chips. Because the wires physically connect to the "FlexSPI2" peripheral inside the chip, the memory will always start at address 0x70000000. You can find this detail on page 35 of the chip's reference manual.
https://www.pjrc.com/teensy/IMXRT1060RM_rev2.pdf
Unfortunately the manual is tough to read and confusing in so many ways. Even on page 35, it describes the memory as "ciphertext", which might be the case if the Bus Encryption Engine (mentioned on page 184) were used in a certain way. No encryption is used, and if you read page 184 you'll see that engine only does decryption, so it really only could be used for read-only memory which is pre-loaded with encrypted data.
The point is we know the RAM always starts at 0x70000000 because of the way the hardware is designed. If you look at the memtest code, you'll see it creates 2 variables called "memory_begin" and "memory_end", which are pointers used to directly access the memory within that known address range.
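To make the pointer idea concrete, here is a minimal sketch of how such a range could be set up. It is not copied from memtest: the names PsramRegion and make_region are made up for illustration, and the size in megabytes is an assumption supplied by the caller (8 for one chip, 16 for two). It only computes addresses, so it compiles anywhere, but dereferencing the pointers only makes sense on a Teensy 4.1 with PSRAM fitted.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Start of the FlexSPI2 memory-mapped range, as described above.
constexpr uintptr_t PSRAM_BASE = 0x70000000u;

// Hypothetical helper type: a half-open range [begin, end) of 32-bit words.
struct PsramRegion {
    volatile uint32_t *begin;
    volatile uint32_t *end;
};

// Build the range covering `size_mb` megabytes starting at `base`.
// `size_mb` is an assumption passed in by the caller, not detected here.
PsramRegion make_region(uintptr_t base, size_t size_mb) {
    assert(base % sizeof(uint32_t) == 0);  // 32-bit words must be 4-byte aligned
    PsramRegion r;
    r.begin = reinterpret_cast<volatile uint32_t *>(base);
    r.end   = reinterpret_cast<volatile uint32_t *>(base + size_mb * 1048576u);
    return r;
}
```

With a single 8MB chip, make_region(PSRAM_BASE, 8) gives a range from 0x70000000 up to, but not including, 0x70800000.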
The idea behind the memory test is that the entire memory is filled with a known data pattern, then read back and checked. This sort of testing is meant to catch any (very unlikely) internal problems inside the memory chips, where writing to one place inside the chip might corrupt data somewhere else in the same chip.
The pointer used is declared with "volatile" which prevents the compiler from trying to optimize away actual memory access.
After the memory is completely filled, you'll see a function called arm_dcache_flush_delete() is used. Even if you use volatile to keep the compiler from optimizing away memory access, the Cortex-M7 hardware has a 32K level 1 data cache, so a write may sit in the cache without reaching the memory chip. This function causes the ARM Cortex-M7 processor to completely write any cached data to the actual memory chip, and then delete the data from its cache. Normally these special cache functions are only used by low-level driver code using Direct Memory Access (DMA). There you need to be sure data you've written is actually in memory before instructing a peripheral to use DMA to directly grab it from the memory (generally DMA-based peripherals can't access the cache), or you need to discard stale cached data before you read data a peripheral put into memory using DMA. Normally you don't need to mess with the cache, but this sort of hardware testing and special benchmarking are the other sort of application where special attention to the processor's cache is important.
Then the entire memory is read back, using the volatile pointer so the compiler doesn't try to do anything too smart. As the data is read, every 32 bit word is compared to the original pattern written. Again, the idea is to check for the pretty unlikely case where the memory appears to work, but could theoretically have an internal problem where writing to one place causes corruption somewhere else inside the chip.
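The write, flush, read-back sequence can be sketched roughly like this. This is a simplified illustration, not the memtest source: fill_pattern, flush_region and count_mismatches are hypothetical names, and counting mismatches (rather than stopping and reporting at the first one) is my simplification. The cache call is only compiled on Teensy 4.x, so the same code also runs against an ordinary array on a PC.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#if defined(__IMXRT1062__)
#include <Arduino.h>  // Teensy 4.x core, provides arm_dcache_flush_delete()
#endif

// Write the same 32-bit word to every location in [begin, end).
void fill_pattern(volatile uint32_t *begin, volatile uint32_t *end, uint32_t pattern) {
    assert(begin <= end);
    for (volatile uint32_t *p = begin; p < end; p++) *p = pattern;
}

// Push cached data out to the memory chip and drop it from the cache.
// On other targets this does nothing, so the sketch still builds on a PC.
void flush_region(volatile uint32_t *begin, volatile uint32_t *end) {
#if defined(__IMXRT1062__)
    arm_dcache_flush_delete((void *)begin,
                            (uint32_t)((uintptr_t)end - (uintptr_t)begin));
#else
    (void)begin;
    (void)end;
#endif
}

// Read every word back and count how many differ from the expected pattern.
size_t count_mismatches(volatile uint32_t *begin, volatile uint32_t *end, uint32_t pattern) {
    size_t errors = 0;
    for (volatile uint32_t *p = begin; p < end; p++) {
        if (*p != pattern) errors++;
    }
    return errors;
}
```

The order matters: fill, then flush, then read. Without the flush in the middle, the read-back could be satisfied from the cache and the test would say little about the memory chip itself.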
The entire test is repeated many times. A variety of fixed 32 bit patterns are used. Many of the tests fill the entire memory with a pseudo-random sequence, of course doing the same verification where any change to even just 1 bit within the entire memory range will be detected and reported as a failure. Maybe that's overkill, but I tried to design the memtest program according to best practices described for testing PC memory, and those are the ways experts recommend to test memory (actually they recommend even more sophisticated & complex patterns... maybe someday I'll add those).
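For the pseudo-random tests, the key point is that the same seed regenerates the same sequence, so nothing besides the seed needs to be stored for verification. Here is a minimal sketch of that idea. Note the generator is a stand-in: I'm using Marsaglia's xorshift32 here because it is short, which is not the generator memtest uses, and fill_random and verify_random are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// One step of the xorshift32 generator (a stand-in, not memtest's generator).
uint32_t xorshift32(uint32_t x) {
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    return x;
}

// Fill [begin, end) with the sequence that starts at `seed`.
void fill_random(volatile uint32_t *begin, volatile uint32_t *end, uint32_t seed) {
    assert(seed != 0);  // xorshift32 stays at zero forever if seeded with zero
    uint32_t value = seed;
    for (volatile uint32_t *p = begin; p < end; p++) {
        *p = value;
        value = xorshift32(value);
    }
}

// Regenerate the same sequence from `seed` and compare it word by word.
// Returns false at the first word that differs, even by a single bit.
bool verify_random(volatile uint32_t *begin, volatile uint32_t *end, uint32_t seed) {
    uint32_t value = seed;
    for (volatile uint32_t *p = begin; p < end; p++) {
        if (*p != value) return false;
        value = xorshift32(value);
    }
    return true;
}
```

On real hardware the cache would be flushed between fill_random and verify_random, the same as with the fixed patterns. Running the pair with several different seeds gives each location a different value on each pass.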
Hopefully this helps answer "can someone explain what's going on with the provided memtest?"
Regarding the soldering, a problem other people have seen involves the solder adhering only to the memory chip pin. Even though it looks ok when viewed from above, it's possible for the solder to be a blob resting just slightly above the pad surface on the circuit board. I would recommend re-heating the solder. Allow time for it to fully heat and flow onto the pad. While extra time heating isn't wonderful for the PSRAM chip, those chips are pretty hardy heat-wise and if the memory isn't working you really don't have much to lose. Counting to 10 while reheating the solder usually works. If you have any liquid flux chemical, applying that before reheating might also help.
Focus your repair effort first on the chip mounted to the smaller set of pads next to the edge of the PCB. The startup code looks for the first 8MB chip in that location. It won't even look for the other chip if no memory is detected in the first location.