Teensy4.1 with 2x 8MB PSRAM Chips: external_psram_size=0 but EXTMEM char[] works as expected?

Question

So I got 2x PSRAM chips from PJRC with my Teensy4.1 and soldered them on (it's a hideous job as I lost my tips) but it's all tested with a multimeter and the connections are solid.

I ran the memtest sketch from here compiled in both Teensyduino and PlatformIO: https://www.pjrc.com/store/psram.html

Aaaaand external_psram_size is reading 0 for both cases. Setting the memory size manually in the program also fails (even with 8MB).

However... EXTMEM vars work as expected. I'm not sure how the mem addressing actually works with the provided memtest program, either - it certainly doesn't seem to use EXTMEM anywhere.

Here's my memory tester (run to crash):

EXTMEM char bigBuf[10000000];
//char bigBuf[10000];
int c = 0;
void setup() {
  c = 0;
}

void loop() {
  bigBuf[c] = 'a';
  Serial.println(bigBuf[c]);
  Serial.println(String(c));
  c = c + 1000;  
  delay(1);
}

Output with EXTMEM char[]: (as expected overflowed until 16MB)

Output with internal mem char[]: (as somewhat expected overflowed at 512K - max of RAM1)

a
0
a
100
...
a
488081
a
489081

Anyway, I'm glad EXTMEM works.. But can someone explain what's going on with the provided memtest? I admit I didn't read it over very well, but I thought that PSRAM could only be accessed via EXTMEM; how is it addressing otherwise?

Paul Stoffregen · Answer 1 · 2021-09-01T11:09:20.487

Paul here. I'm the author of that memtest program, and also the creator of Teensy. While luni64 already answered very well about your test program, hopefully I can add some clarity about the official memtest program.

Regarding "I thought that PSRAM could only be accessed via EXTMEM", indeed use of EXTMEM arrays or variables is the normal way. But it is not the only way. The memtest program uses pointers with the known address range of those memory chips. Because the wires physically connect to the "FlexSPI2" peripheral inside the chip, the memory will always start at address 0x70000000. You can find this detail on page 35 of the chip's reference manual.

https://www.pjrc.com/teensy/IMXRT1060RM_rev2.pdf

Unfortunately the manual is tough to read and confusing in so many ways. Even on page 35, it describes the memory as "ciphertext", which might be the case if the Bus Encryption Engine (mentioned on page 184) were used in a certain way. No encryption is used, and if you read page 184 you'll see that engine only does decryption, so it really only could be used for read-only memory which is pre-loaded with encrypted data.

The point is we know the RAM always starts at 0x70000000 because of the way the hardware is designed. If you look at the memtest code, you'll see it creates 2 variables called "memory_begin" and "memory_end", which are pointers used to directly access the memory within that known address range.
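As a rough sketch of that address arithmetic (not the memtest source itself): the base address 0x70000000 is the one given above, while the names `kPsramBase`, `kBytesPerMB` and `psram_end` are made up for illustration, and `size_mb` stands in for whatever size you expect the startup code to report.

```cpp
#include <stdint.h>

// Start of the FlexSPI2-mapped PSRAM, as stated above.
constexpr uint32_t kPsramBase = 0x70000000u;
constexpr uint32_t kBytesPerMB = 1048576u;

// One-past-the-end address for size_mb megabytes of PSRAM.
// Hypothetical helper: size_mb would be 8 with one chip, 16 with two.
constexpr uint32_t psram_end(uint32_t size_mb) {
    return kPsramBase + size_mb * kBytesPerMB;
}
```

On the Teensy itself you would cast these values to `volatile uint32_t *` before dereferencing them, which is roughly the role "memory_begin" and "memory_end" play in memtest.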

The idea behind the memory test is the entire memory is filled with a known data pattern. This sort of testing is meant to catch any (very unlikely) internal problems inside the memory chips, where writing to one place inside the chip might corrupt data somewhere else in the same chip.

The pointer used is declared with "volatile" which prevents the compiler from trying to optimize away actual memory access.

After the memory is completely filled, you'll see a function called arm_dcache_flush_delete() is used. Even if you use volatile to keep the compiler from optimizing away memory access, the Cortex-M7 hardware has 32K level 1 cache. This function causes the ARM Cortex-M7 processor to completely write any cached data to the actual memory chip, and then delete the data from its cache. Normally these special cache functions are only used by low-level driver code using Direct Memory Access (DMA), where you need to be sure data you've written is actually in memory before instructing a peripheral to use DMA to directly grab it from the memory (generally DMA-based peripherals can't access the cache), or before you read data a peripheral put into memory using DMA. Normally you don't need to mess with the cache, but this sort of hardware testing and special benchmarking are the other sort of application where special attention to the processor's cache is important.

Then the entire memory is read back, using the volatile pointer so the compiler doesn't try to do anything too smart. As the data is read, every 32 bit word is compared to the original pattern written. Again, the idea is to check for the pretty unlikely case where the memory appears to work, but could theoretically have an internal problem where writing to one place causes corruption somewhere else inside the chip.
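To make the fill, flush, read-back sequence concrete, here is a minimal sketch under a few assumptions: `fill_fixed` and `count_mismatches` are hypothetical names (not the functions in memtest), and the cache flush is guarded so the same code also compiles on a PC, where it runs against an ordinary array instead of PSRAM.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__IMXRT1062__)
#include <Arduino.h>  // Teensy 4.x core: declares arm_dcache_flush_delete()
#endif

// Write one fixed 32-bit pattern to every word in [begin, end).
void fill_fixed(volatile uint32_t *begin, volatile uint32_t *end, uint32_t pattern) {
    assert(begin <= end);
    for (volatile uint32_t *p = begin; p < end; ++p) {
        *p = pattern;  // volatile: the compiler must emit each store
    }
#if defined(__IMXRT1062__)
    // Push cached writes out to the chip and drop the cached copies,
    // so later reads come from the memory itself rather than the cache.
    arm_dcache_flush_delete((void *)begin,
                            (uint32_t)((uintptr_t)end - (uintptr_t)begin));
#endif
}

// Read every word back and count how many no longer match the pattern.
size_t count_mismatches(volatile uint32_t *begin, volatile uint32_t *end, uint32_t pattern) {
    size_t bad = 0;
    for (volatile uint32_t *p = begin; p < end; ++p) {
        if (*p != pattern) ++bad;
    }
    return bad;
}
```

Splitting the fill from the verification keeps the flush in one obvious place and lets you report a mismatch count instead of stopping at the first bad word.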

The entire test is repeated many times. A variety of fixed 32 bit patterns are used. Many of the tests fill the entire memory with a pseudo-random sequence, of course doing the same verification where any change to even just 1 bit within the entire memory range will be detected and reported as a failure. Maybe that's overkill, but I tried to design the memtest program according to best practices described for testing PC memory, and those are the ways experts recommend to test memory (actually they recommend even more sophisticated & complex patterns... maybe someday I'll add those).
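A pseudo-random pass could look something like the sketch below. The generator here is a plain xorshift32, chosen only because it is short; it is an assumption for illustration and not necessarily the sequence memtest uses. `fill_random` and `verify_random` are hypothetical names. On real hardware a cache flush would go between the two calls, as described above.

```cpp
#include <assert.h>
#include <stdint.h>

// One xorshift32 step (stand-in generator, see the note above).
inline uint32_t next_random(uint32_t x) {
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    return x;
}

// Fill [begin, end) with the sequence generated from seed.
void fill_random(volatile uint32_t *begin, volatile uint32_t *end, uint32_t seed) {
    assert(seed != 0);  // xorshift32 stays at 0 forever if seeded with 0
    uint32_t x = seed;
    for (volatile uint32_t *p = begin; p < end; ++p) {
        x = next_random(x);
        *p = x;
    }
}

// Regenerate the same sequence from seed and compare it word by word.
// Returns false as soon as a single word (even one bit) differs.
bool verify_random(volatile uint32_t *begin, volatile uint32_t *end, uint32_t seed) {
    uint32_t x = seed;
    for (volatile uint32_t *p = begin; p < end; ++p) {
        x = next_random(x);
        if (*p != x) return false;
    }
    return true;
}
```

Because the sequence is regenerated from the seed during verification, nothing has to be stored apart from the seed, which is what makes this practical for filling the whole memory range.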

Hopefully this helps answer "can someone explain what's going on with the provided memtest?"

Regarding the soldering, a problem other people have seen involves the solder adhering only to the memory chip pin. Even though it looks ok when viewed from above, it's possible for the solder to be a blob resting just slightly above the pad surface on the circuit board. I would recommend re-heating the solder. Allow time for it to fully heat and flow onto the pad. While extra time heating isn't wonderful for the PSRAM chip, those chips are pretty hardy heat-wise and if the memory isn't working you really don't have much to lose. Counting to 10 while reheating the solder usually works. If you have any liquid flux, applying that before reheating might also help.

Focus your repair effort first on the chip mounted to the smaller set of pads next to the edge of the PCB. The startup code looks for the first 8MB chip in that location. It won't even look for the other chip if no memory is detected in the first location.
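Given that detection order, the size reported at startup can be read as a hint about which chip to rework. A small sketch of that reasoning follows, assuming 8MB chips as in the question; the `PsramState` enum and `classify_psram` helper are hypothetical, and `size_mb` is meant to be the value you see in external_psram_size.

```cpp
#include <stdint.h>

// Hypothetical labels for what a reported size implies.
enum class PsramState { None, FirstOnly, Both, Unexpected };

// Map a reported size in MB to a state, assuming 8MB chips.
constexpr PsramState classify_psram(uint8_t size_mb) {
    return size_mb == 0  ? PsramState::None       // first location not detected
         : size_mb == 8  ? PsramState::FirstOnly  // second chip missing or not responding
         : size_mb == 16 ? PsramState::Both       // both chips detected
                         : PsramState::Unexpected;
}
```

A result of `None` does not say anything about the second chip, since the startup code never gets that far; that is why the chip nearest the PCB edge is the one to fix first.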

It's amusing that ASE had me reviewing your "late answer". You know, to make sure that it was up to standards. Gee, I dunno... I guess it's alright. =) — timemage, Sep 01 '21 at 14:03

score 4 · Accepted Answer · answered Feb 06 '21 at 20:40

I'm afraid your test program doesn't work as you intended. I tested it with a T4.1 without RAM soldered on and it gives the same result as in your question.

You didn't declare your array volatile. Thus, the compiler can (and probably will) simply optimize away the read-back.
Even if you declare the variable volatile, the processor reads the value back from the cache line, i.e. you always get the correct value back, even without any external RAM.

Here is a simple test program showing the effect. It always switches the LED on, even without external RAM soldered on the board.

EXTMEM volatile int test;

void setup()
{
    pinMode(13, OUTPUT);

    test = 42;
    if(test == 42)
    {
        digitalWriteFast(13, HIGH);
    }
}

void loop(){   
}

Here is the disassembly of the setup function. It clearly shows that the processor tries to store 42 in the (non-existent) external RAM and reads it back from there. Obviously the value read back comes from the cache line. (comments added by me)

void setup()
{
      7c:   push    {r3, lr}                         // pinMode code
    pinMode(13, OUTPUT);                             //
      7e:   movs    r1, #1                           //
      80:   movs    r0, #13                          //
      82:   bl  160c <pinMode>                       //-------------          

    test = 42;
      86:   ldr r3, [pc, #20]   ; (9c <setup+0x20>)  // load r3 with address of 'test' (0x70000000, stored at 0x9C)
      88:   movs    r2, #42 ; 0x2a                   // load r2 with #42
      8a:   str r2, [r3, #0]                         // store content of r2 at address stored in r3, i.e., store #42 at 0x7000'0000 (EXTRAM)
    
    if(test == 42)
      8c:   ldr r3, [r3, #0]                         // volatile forces the compiler to read back 'test' (store it in r3)
      8e:   cmp r3, r2                               // check if equals r2 (contains #42)
      90:   bne.n   9a <setup+0x1e>                  // goto 0x9A if not equal  
    
CORE_PIN13_PORTSET = CORE_PIN13_BITMASK;             // digitalwritefast code (next 3 lines) 
      92:   ldr r3, [pc, #12]   ; (a0 <setup+0x24>)  // ... 
      94:   movs    r2, #8                           // ...
      96:   str.w   r2, [r3, #132]  ; 0x84           // -------

      9a:   pop {r3, pc}                             // return from setup()

      9c:   .word   0x70000000                       // Address of 'test'; EXTRAM starts at 0x7000'0000
      a0:   .word   0x42004000                       // Address of GPIO register for LED

Thanks for the input; it's odd EXTMEM only decides to crash at 16MB but that seems to be what's happening. You were right - I tried to actually read from a value assigned near the beginning of a 14MB EXTMEM array and got nothing. I read some stuff on the Teensy4.1 beta testing thread (Arduino forum I think) about a couple digital pins being used to detect the PSRAM chips... I'll have to look into that somehow, or hope someone has other ideas (I guess I may have fried one soldering - I'll desolder one and see if that helps). Thx again, never use C so didn't even consider volatile. — CSoft, Feb 07 '21 at 03:14
