Author Topic: fw_printenv offsets  (Read 9509 times)
fsk141
Guest
« on: February 05, 2010, 03:06:26 PM »

Hai,

I just compiled u-boot-env (fw_printenv/setenv) for my pogoplug (for plugapps) and am having a little issue setting the correct offsets. Can someone help me set up my /etc/fw_env.config?

I have a pinkie (pogoplug v2)

 /etc/fw_env.config:

# MTD device name       Device offset   Env. size       Flash sector size
/dev/mtd0               0x20000       0x1000        0x2000

------
I've tried a bunch of things, I either get:
Warning: Bad CRC, using default environment
bootcmd=bootp; setenv bootargs root=/dev/nfs nfsroot=${serverip}:${rootpath} ip=${ipaddr}:${serverip}:${gatewayip}:${netmask}:${hostname}::off; bootm
bootdelay=5
baudrate=115200

-- or --
Too few good blocks within range

[fsk141@Pogoplug uboot-env]$ cat /proc/mtd
dev:    size   erasesize  name
mtd0: 00100000 00020000 "u-boot"
mtd1: 00400000 00020000 "uImage"
mtd2: 02000000 00020000 "root"
mtd3: 05b00000 00020000 "data"

------

This is all the information I can think of to give. Let me know if you need anything else.
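A quick sanity check of the fw_env.config values against /proc/mtd can be sketched as follows. This is only a consistency check under the assumption that the "Flash sector size" column should equal the partition's erasesize and that the environment must fit inside the partition; it does not find the correct offset. The function names are made up for illustration.

```python
# /proc/mtd output as posted above.
PROC_MTD = '''dev:    size   erasesize  name
mtd0: 00100000 00020000 "u-boot"
mtd1: 00400000 00020000 "uImage"
mtd2: 02000000 00020000 "root"
mtd3: 05b00000 00020000 "data"
'''


def parse_proc_mtd(text):
    """Return {device: (size, erasesize, name)} parsed from /proc/mtd text."""
    partitions = {}
    for line in text.splitlines()[1:]:  # skip the header line
        if not line.strip():
            continue
        dev, size, erasesize, name = line.split(None, 3)
        partitions[dev.rstrip(":")] = (
            int(size, 16),
            int(erasesize, 16),
            name.strip().strip('"'),
        )
    return partitions


def check_env_config(partitions, dev, offset, env_size, sector_size):
    """Return a list of inconsistencies between a config line and /proc/mtd."""
    size, erasesize, _name = partitions[dev]
    problems = []
    if sector_size != erasesize:
        problems.append(
            "sector size 0x%x differs from erasesize 0x%x" % (sector_size, erasesize)
        )
    if offset % erasesize:
        problems.append("offset 0x%x is not erase-block aligned" % offset)
    if offset + env_size > size:
        problems.append("environment extends past the end of the partition")
    return problems


if __name__ == "__main__":
    parts = parse_proc_mtd(PROC_MTD)
    # Values from the config line in the first post.
    for problem in check_env_config(parts, "mtd0", 0x20000, 0x1000, 0x2000):
        print(problem)
```

With the values from the first post, the check flags the sector size (0x2000 in the config versus an erasesize of 0x20000 reported by the kernel).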
Logged

pingtoo
Sr. Member
****

Karma: 15
Posts: 318


View Profile
« Reply #1 on: February 05, 2010, 03:26:08 PM »

Do you know of a case where this works?

Several people have tried various addresses, but none seems to have gotten it to work. I know I read some old threads about this topic, but I cannot find them right now.

Let me know if you got it working.

From looking at the source code, I know it should be at the end of u-boot, and from sheevainstaller it seems to be at 0xa0000 with a size of 0x40000. But when I tried that on the plug, it did not work.
Logged

Good Luck Smiley

e-squizo
Newbie
*

Karma: 3
Posts: 22


View Profile
« Reply #2 on: March 13, 2010, 06:19:28 AM »

Several people have tried with various addresses none seems to be able to do so. I know I read some of the old threads about this topic but I can not find them right now.
[...]
I know from look at the source code it should be at end of u-boot and from sheevainstaller it seems to be at 0xa0000 for size of 0x40000. but when I tried on plug it does not work.

Through a process of looking at the raw data at /dev/mtd0 with my eyes crossed, scratching my head for about 25 seconds and going with a hunch, I discovered that these values for /etc/fw_env.config work on older (pre 1.3) plugs:
Code:
# MTD device name       Device offset   Env. size       Flash sector size
/dev/mtd0               0xa0000         0x20000         0x20000

With these values, I was able both to read and write values from uBoot's environment on an older plug. On a 1.3 plug, however, I get "Warning: Bad CRC, using default environment".
Logged

pingtoo
Sr. Member
****

Karma: 15
Posts: 318


View Profile
« Reply #3 on: March 13, 2010, 07:45:39 AM »

Several people have tried with various addresses none seems to be able to do so. I know I read some of the old threads about this topic but I can not find them right now.
[...]
I know from look at the source code it should be at end of u-boot and from sheevainstaller it seems to be at 0xa0000 for size of 0x40000. but when I tried on plug it does not work.

Through a process of looking at the raw data at /dev/mtd0 with my eyes crossed, scratching my head for about 25 seconds and going with a hunch, I discovered that these values for /etc/fw_env.config work on older (pre 1.3) plugs:
Code:
# MTD device name       Device offset   Env. size       Flash sector size
/dev/mtd0               0xa0000         0x20000         0x20000

With these values, I was able both to read and write values from uBoot's environment on an older plug. On an 1.3 plug, however, I get "Warning: Bad CRC, using default environment".
Do you know which version of fw_setenv/fw_printenv that is, and whether it is based on mtd-utils?
Logged

Good Luck Smiley

e-squizo
Newbie
*

Karma: 3
Posts: 22


View Profile
« Reply #4 on: March 13, 2010, 08:25:54 AM »

Do you know what version of fw_setenv/fw_printenv it base from mtd-utils?

It's the one from Debian squeeze, which seems rather old: the version information is "20081215-2".
Logged

e-squizo
Newbie
*

Karma: 3
Posts: 22


View Profile
« Reply #5 on: March 13, 2010, 03:36:06 PM »

With these values, I was able both to read and write values from uBoot's environment on an older plug.

DO NOT TRY TO WRITE TO YOUR ENVIRONMENT WITH DEBIAN'S FW_SETENV

If you are lucky, it will fail. If you are unlucky, it will mess up your environment's CRC. fw_printenv will be completely happy about it, but u-boot will reset to the default environment on your next boot, probably messing up your boot process. It's not bad enough to brick your plug, but you will need to re-set the environment to your liking.

I'm trying to figure out what is going wrong here: two different plugs, with identical u-boot versions, produce very different results, and that is plain weird. To make matters worse, the code in fw_printenv is ugly as hell, so I'm still trying to figure out why u-boot and fw_{print,set}env compute different CRCs for the same environment.
Logged

e-squizo
Newbie
*

Karma: 3
Posts: 22


View Profile
« Reply #6 on: March 13, 2010, 06:12:14 PM »

I'm still trying to figure out why u-boot and fw_{print,set}env compute different CRCs for the same environment.

In the meantime, I found out that fw_printenv is computing crc32 correctly. So it seems that u-boot does not use crc32 but another checksum algorithm?
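One way to test whether the checksum algorithm is really the issue is to recompute the CRC over a known-good raw copy of the environment. A minimal sketch, assuming the environment blob starts with a 4-byte CRC followed by the data it covers, and that the checksum is the standard CRC-32 as implemented by zlib; the byte order is left as a parameter because it is not established at this point in the thread, and the function name is hypothetical.

```python
import struct
import zlib


def env_crc_matches(blob, byteorder="big"):
    """Compare the stored 4-byte CRC with a CRC-32 of the rest of the blob.

    Returns (matches, stored_crc, computed_crc).
    """
    fmt = ">I" if byteorder == "big" else "<I"
    (stored,) = struct.unpack(fmt, blob[:4])
    computed = zlib.crc32(blob[4:]) & 0xFFFFFFFF
    return stored == computed, stored, computed


if __name__ == "__main__":
    # Synthetic blob for demonstration only; not data from a real plug.
    payload = b"bootdelay=5\0baudrate=115200\0\0"
    blob = struct.pack(">I", zlib.crc32(payload) & 0xFFFFFFFF) + payload
    print(env_crc_matches(blob))
```

If the check passes on a raw dump but fails on what fw_printenv reads, the two are looking at different data rather than using different algorithms.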
Logged

pingtoo
Sr. Member
****

Karma: 15
Posts: 318


View Profile
« Reply #7 on: March 13, 2010, 10:05:16 PM »

I'm still trying to figure out why u-boot and fw_{print,set}env compute different CRCs for the same environment.

In the meantime, I found out that fw_printenv is computing crc32 correctly. So it seems that u-boot does not use crc32 but another checksum algorithm?

I think it is crc7; I checked the u-boot code.
Logged

Good Luck Smiley

e-squizo
Newbie
*

Karma: 3
Posts: 22


View Profile
« Reply #8 on: March 14, 2010, 12:28:21 PM »

I think it is crc7, I check u-boot code.

I seriously doubt it: I couldn't find even a reference to crc7 in the u-boot source code.

Yet, we are getting closer!

fw_printenv computes a different CRC than u-boot because they are operating on different data, as a result of read errors caused by using the wrong ECC mechanism. From what I can read in another thread http://plugcomputer.org/plugforum/index.php?topic=117.msg654#msg654, the problem is that u-boot handles the NAND with 4-bit ECC, while Linux tries to handle it with 1-bit ECC, because the latter seems to be more reasonable for the specific type of NAND used in the Sheevaplug (don't ask me why: I'm basically rephrasing the contents of that post, and I am only beginning to understand how this whole NAND business works). This ECC information seems to be stored inside some out-of-band (OOB) data that the NAND memory sets aside for each block.

Fortunately, there is a solution: the package mtd-utils has handy utilities such as nanddump and nandwrite that allow us to do I/O to the NAND in raw mode, bypassing the kernel's ECC logic and enabling access to the OOB data, which is normally hidden from the user. With these utilities, we can dump the environment's contents, edit them to our heart's content, then write them back.

The only problem: when we write it back, we need to do so with 4-bit ECC, so u-boot will be able to read it. I haven't found a way to tell nandwrite that it should write with 4-bit ECC (it only has a flag to turn ECC on or off), but maybe it's there, and I'm too dumb to find it. Even if it's not there, we can still compute the whole ECC thing by ourselves in the user side, and write the data back with pre-computed ECC and all in raw mode, which nandwrite supports.

The only problem now is that I still haven't found any reference on how this 4-bit ECC is supposed to be computed and stored, but as soon as we get that sorted out, we should be good to go.
Logged

e-squizo
Newbie
*

Karma: 3
Posts: 22


View Profile
« Reply #9 on: March 14, 2010, 05:13:47 PM »

I have confirmed the suspicion that the CRC mismatch is due to ECC, so we should be a very small distance away from a solution. However, I can't seem to find any reference about which ECC is used, nor how it is encoded in the NAND's 64-byte OOB sector. There's a file called drivers/mtd/nand/nand_ecc.c within u-boot sources, but from what little I could gather from the comments, it seems to be a 1-bit ECC, not the 4-bit ECC we are looking for.

If anyone knows about how to compute and encode the OOB data, I'm sure we could have a working solution within hours.

« Last Edit: March 14, 2010, 05:21:11 PM by e-squizo » Logged

e-squizo
Newbie
*

Karma: 3
Posts: 22


View Profile
« Reply #10 on: March 15, 2010, 07:26:30 AM »

Again, trying to figure out the layout by squinting at a hex dump of the OOB data that is present every 2048 bytes, it looks to me like it contains:

  • 24 bytes set to 0xff, regardless of the payload values;
  • 40 bytes of ECC data, apparently structured in 4 10-byte blocks, each corresponding to a 512-byte data block. The ECC for 512 contiguous 0x00 bytes seems to be all zeros, the ECC for all 0xff seems to be all 0xff.

This is consistent with info I gathered by searching the web for "nand OOB layout 4-bit ECC" (and about a dozen other combinations of search terms), although it would seem to me that those 24 initial bytes set to 0xff ought to be put to some use, but maybe they are only used if the sector is bad or something.
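The observed layout can be written down as a small helper that splits one 64-byte OOB area into its parts. This is a sketch of the layout as guessed above (24 bytes of 0xff followed by four 10-byte ECC fields); the assumption that the n-th ECC field belongs to the n-th 512-byte chunk of the page is not confirmed in the thread, and the names are illustrative.

```python
OOB_SIZE = 64    # OOB bytes per 2048-byte page
PAD_SIZE = 24    # leading bytes observed to be 0xff
ECC_SIZE = 10    # ECC bytes per 512-byte data chunk
ECC_BLOCKS = 4   # 4 x 512 bytes = one 2048-byte page


def split_oob(oob):
    """Split a 64-byte OOB area into (padding, [ecc0, ecc1, ecc2, ecc3])."""
    if len(oob) != OOB_SIZE:
        raise ValueError("expected %d OOB bytes, got %d" % (OOB_SIZE, len(oob)))
    pad = oob[:PAD_SIZE]
    ecc = [
        oob[PAD_SIZE + i * ECC_SIZE:PAD_SIZE + (i + 1) * ECC_SIZE]
        for i in range(ECC_BLOCKS)
    ]
    return pad, ecc


if __name__ == "__main__":
    # Synthetic OOB area for demonstration only.
    pad, ecc = split_oob(b"\xff" * 24 + bytes(range(40)))
    print(pad == b"\xff" * 24, [e.hex() for e in ecc])
```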

By the way, only data in /dev/mtd0 seems to have this layout: /dev/mtd1's and /dev/mtd2's OOB data seems to have an entirely different layout, which is consistent with the theory that /dev/mtd0 uses a different ECC than the rest.

Now, all we need is to identify the ECC, and an algorithm to implement it. Of course, the code to do this ought to be available inside u-boot, since it can read/write the environment, but I haven't been able to find where in u-boot this computation is done :-(

« Last Edit: March 15, 2010, 07:41:48 AM by e-squizo » Logged

pingtoo
Sr. Member
****

Karma: 15
Posts: 318


View Profile
« Reply #11 on: March 15, 2010, 09:03:40 AM »

Again, trying to figure out the layout by squinting at an hex dump of the OOB data that is present every 2048 bytes, it looks to me like it contains:

  • 24 bytes set to 0xff, regardless of the payload values;
  • 40 bytes of ECC data, apparently structured in 4 10-byte blocks, each corresponding to a 512-byte data block. The ECC for 512 contiguous 0x00 bytes seems to be all zeros, the ECC for all 0xff seems to be all 0xff.

This is consistent with info I gathered by searching the web for "nand OOB layout 4-bit ECC" (and about a dozen other combinations of search terms), although it would seem to me that those 24 initial bytes set to 0xff ought to be put to some use, but maybe they are only used if the sector is bad or something.

By the way, only data in /dev/mtd0 seems to have this layout: /dev/mtd1's and /dev/mtd2's OOB data seems to have an entirely different layout, which is consistent with the theory that /dev/mtd0 uses a different ECC than the rest.

Now, all we need is to identify the ECC, and algorithm to implement it. Of course, the code to do this ought to be available inside u-boot, since it can read/write the environment, but I haven't been able to find where in u-boot this computation is done :-(


I wonder where I got the impression it was using CRC7 Embarrassed

Anyway, I think I have figured out why fw_setenv/fw_printenv is not working on MTD devices from Linux: the utility reads and writes the mtd device as a raw stream without considering the OOB blocks, so the end result will always have a CRC error because of the extra OOB data in the mix.

I propose we simply skip finding the ECC calculation routines and instead let the existing utilities do the work. We can use a program from mtd-utils to dump the contents of mtd0, then use the fw_setenv supplied with sheeva-installer to modify the dump, then use nandwrite to put the modified dump back on mtd0.

I am currently at the office and do not have access to a plug, but I envision we can use the following code:
Code:
linux root# nanddump -o -b -s 0xa0000 -l 0x20000 -f /tmp/uboot-env.bin /dev/mtd0
linux root# cd /tmp
linux root# fw_setenv env_var_name new_value #<= fw_setenv from sheeva-installer; name and value are separate arguments
linux root# nandwrite -a -p -s 0xa0000 /dev/mtd0 /tmp/uboot-env.bin

The code above may contain errors. I am not sure whether mtd-utils can take hexadecimal values as arguments. I also suspect that nandwrite may have a different idea of how to fill in the OOB area, which may need some patches.
Logged

Good Luck Smiley

e-squizo
Newbie
*

Karma: 3
Posts: 22


View Profile
« Reply #12 on: March 15, 2010, 09:53:25 AM »

Anyway, I think I have figured out why fw_setenv/fw_printenv is not working on MTD devices from Linux: the utility reads and writes the mtd device as a raw stream without considering the OOB blocks, so the end result will always have a CRC error because of the extra OOB data in the mix.

No, that is definitely not the problem. fw_{print,set}env are not reading /dev/mtd0 in raw mode, and they don't get to see the OOB data. This can be seen both in the data that they read, and in the fact that whenever you use them to access the NAND, the operating system complains about unrecoverable errors once for each 2KB block (you can see the messages in the console, or in dmesg).

This is an indication of the real problem: those programs attempt to read the data in normal mode, so the kernel tries to use the ECCs it expects in the OOB data to correct eventual errors. But the kernel expects a different ECC than the one that is actually stored in the OOB data, and so it detects an unrecoverable error --- but not before it has attempted a correction or two, so the data that the programs get is actually different from the data that is stored, hence the CRC mismatch.

All of this isn't guessing: I have actually compared the data that fw_printenv gets with the data read by nanddump -s 0xa0000 -n -o /dev/mtd0, and they are different.
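A comparison like the one described can be narrowed down to the affected 2KB pages with a few lines of Python. This sketch assumes both reads have already been saved to files or byte strings of equal length, without OOB data; the function name and the 2048-byte page size parameter are illustrative.

```python
def differing_pages(a, b, page_size=2048):
    """Return the indexes of the pages in which two equal-length reads differ."""
    if len(a) != len(b):
        raise ValueError("reads have different lengths: %d vs %d" % (len(a), len(b)))
    return [
        offset // page_size
        for offset in range(0, len(a), page_size)
        if a[offset:offset + page_size] != b[offset:offset + page_size]
    ]


if __name__ == "__main__":
    # Synthetic data for demonstration: flip one bit in the second page.
    raw = bytes(3 * 2048)
    corrected = bytearray(raw)
    corrected[2050] ^= 0x01
    print(differing_pages(raw, bytes(corrected)))
```

Pages that show up in this list are the ones where the kernel's ECC handling appears to have altered the data on its way to user space.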

Quote
I propose we simply skip finding the ECC calculation routines and instead let the existing utilities do the work. We can use a program from mtd-utils to dump the contents of mtd0, then use the fw_setenv supplied with sheeva-installer to modify the dump, then use nandwrite to put the modified dump back on mtd0.

This is the course of action I am pursuing at the time, but I am afraid that it won't be so simple. Here's what works, and what doesn't.

  • To read the environment data, we can simply do nanddump -s 0xa0000 -n -o -l 0x20000 to dump it to stdout. This is the Really Simple Approach, because it doesn't take into account the possibility that the NAND may indeed have flipped a bit, in which case we are in deep doo-doo. The better solution would be to use nanddump -s 0xa0000 -n -l 0x20000 to get the data with OOB data inline, and feed that to a filter to interpret the ECC and correct any errors it finds, and maybe even check that the CRC is kosher.
  • Either use dd, tr and sed to convert the output of the last command into a text file that can be edited with your favourite editor, or let the filter described in the previous point do this job.
  • Once the file looks the way you want it, re-format it into a 128KB file, with leading CRC32 in big-endian order at the first 4 bytes, the environment immediately afterwards, and padded with 0x00, and write that using nandwrite. But this won't work if we attempt the simplistic approach of just doing nandwrite -x 0xa0000.
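The first two steps can be sketched in Python instead of dd, tr and sed. This is a minimal version under several assumptions: the dump contains each 2048-byte page immediately followed by its 64 bytes of OOB data, the environment starts with a 4-byte CRC, and the variables are NUL-separated "name=value" strings ending at the first double NUL. It does no ECC checking or correction, and the function names are made up.

```python
PAGE_SIZE = 2048
OOB_SIZE = 64


def strip_inline_oob(dump, page_size=PAGE_SIZE, oob_size=OOB_SIZE):
    """Drop the OOB bytes that follow every page in a dump with inline OOB."""
    stride = page_size + oob_size
    if len(dump) % stride:
        raise ValueError("dump length is not a multiple of page + OOB size")
    return b"".join(dump[i:i + page_size] for i in range(0, len(dump), stride))


def env_to_text(env_blob):
    """Skip the 4-byte CRC and return one "name=value" line per variable."""
    body = env_blob[4:]
    end = body.find(b"\0\0")  # a double NUL marks the end of the variables
    if end != -1:
        body = body[:end]
    lines = [s.decode("ascii", "replace") for s in body.split(b"\0") if s]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Synthetic two-page dump for demonstration only (CRC field zeroed).
    page0 = (b"\0\0\0\0" + b"bootdelay=5\0baudrate=115200\0\0").ljust(PAGE_SIZE, b"\0")
    dump = page0 + b"\xff" * OOB_SIZE + bytes(PAGE_SIZE) + b"\xff" * OOB_SIZE
    print(env_to_text(strip_inline_oob(dump)), end="")
```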

The problem with the last point is that if we attempt to write the data like that, nandwrite will write the data with a 1-bit ECC, while u-boot expects 4-bit ECC... we have the opposite of the problem we had at the beginning: we can read the data, but U-Boot can't. We could use nandwrite -n to write the data without ECC, but that will leave the NAND in an inconsistent state, which u-boot will not be able to read either.

The only clean way I see out of this is to compute the ECC ourselves in user space, and use nandwrite -o to write both the payload and the precomputed OOB data to the NAND. The program that does this could also do the reformatting, padding and CRC computation described in the last step.
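The reformatting, padding and CRC part of such a program could look roughly like this. It is a sketch only: it follows the layout described above (128KB image, CRC32 in the first 4 bytes, environment right after it, padded with 0x00), assumes the CRC covers everything after those 4 bytes and is the zlib-style CRC-32, and does not compute any ECC or OOB data. The function name and the byteorder parameter are illustrative.

```python
import struct
import zlib


def build_env_image(lines, env_size=0x20000, byteorder="big"):
    """Build an environment image from a list of "name=value" strings."""
    # Each variable is NUL-terminated; an extra NUL ends the list.
    body = b"".join(line.encode("ascii") + b"\0" for line in lines) + b"\0"
    data_size = env_size - 4  # room left after the CRC field
    if len(body) > data_size:
        raise ValueError("environment does not fit in 0x%x bytes" % env_size)
    body = body.ljust(data_size, b"\0")
    crc = zlib.crc32(body) & 0xFFFFFFFF
    fmt = ">I" if byteorder == "big" else "<I"
    return struct.pack(fmt, crc) + body


if __name__ == "__main__":
    image = build_env_image(["bootdelay=5", "baudrate=115200"])
    print(len(image), image[:4].hex())
```

The resulting image is only the payload; the per-page OOB data with the 4-bit ECC would still have to be generated separately before anything is written back.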

This makes everybody happy, the only thing standing between me and having working code is the fact that I still don't know how the ECC is computed (and no: the functions in drivers/mtd/nand/nand_ecc.c are no help, they seem to be doing 1-bit ECC).

If we figure that out, we're there.
Logged

pingtoo
Sr. Member
****

Karma: 15
Posts: 318


View Profile
« Reply #13 on: March 15, 2010, 10:29:05 AM »

...

No, that is definitely not the problem. fw_{print,set}env are not reading /dev/mtd0 in raw mode, and they don't get to see the OOB data. This can be seen both in the data that they read, and in the fact that whenever you use them to access the NAND, the operating system complains about unrecoverable errors once for each 2KB block (you can see the messages in the console, or in dmesg).
I think this may be a misunderstanding or wrong terminology. fw_{print,set}env uses the read/write system calls to do its I/O, and I see no evidence of it skipping OOB blocks, so that is why I think this is the problem.

Since /dev/mtd0 is a block device (I think), maybe we can still do what I suggested with /dev/mtdchar0?
...

The problem with the last point is that if we attempt to write the data like that, nandwrite will write the data with a 1-bit ECC, while u-boot expects 4-bit ECC... we have the opposite problem than we had at the beginning: we can read the data, but U-Boot can't. We could use nandwrite -n to write the data without ECC, but that will leave the NAND in an inconsistent state, which u-boot will not be able to read either.

The only clean way I see out of this is to compute the ECC ourselves in user space, and use nandwrite -o to write both the payload and the precomputed OOB data to the NAND. The program that does this could also do the reformatting, padding and CRC computation described in the last step.

This makes everybody happy, the only thing standing between me and having working code is the fact that I still don't know how the ECC is computed (and no: the functions in drivers/mtd/nand/nand_ecc.c are no help, they seem to be doing 1-bit ECC).

If we figure that out, we're there.
Do you have the u-boot source from the CD, or did you download it from the wiki? I see there is a drivers/mtd/nand/nand_ecc_rs.c; maybe that is where the 4-bit calculation code is. I read the file, but it is beyond what my little head can understand Grin
Logged

Good Luck Smiley

e-squizo
Newbie
*

Karma: 3
Posts: 22


View Profile
« Reply #14 on: March 15, 2010, 10:44:16 AM »

I think this may be a misunderstanding or wrong terminology. the fw{print,set}env using read/write system call to do its I/O and I see no evidences about skipping OOB blocks so that is why I think it is the problem.

That is why it's called "out-of-band" data: read/write system calls by themselves will skip the OOB data. If you want to read or write OOB data, you must do it through the MEMREADOOB or MEMWRITEOOB ioctl, respectively.

Quote
Since /dev/mtd0 is block device (I think) may be we can still do my suggestion with /dev/mtdchar0?

No, you are thinking of /dev/mtdblock0. /dev/mtd0 is a character device.
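Whether a given device node is a character or a block device can be checked directly rather than guessed. A small sketch using only the standard library; the function name is made up, and /dev/null stands in for /dev/mtd0 so the example runs on any Linux machine.

```python
import os
import stat


def device_kind(path):
    """Return "char", "block" or "other" for the given filesystem path."""
    mode = os.stat(path).st_mode
    if stat.S_ISCHR(mode):
        return "char"
    if stat.S_ISBLK(mode):
        return "block"
    return "other"


if __name__ == "__main__":
    # On a plug, pass "/dev/mtd0" or "/dev/mtdblock0" instead.
    print(device_kind("/dev/null"))
```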

Quote
do you have u-boot source from CD or download from wiki? I see there is a drivers/mtd/nand/nand_ecc_rs.c may be that is where the 4 bit calculation code is. I read the file but it is beyond my little head can understand Grin

I got it from denx's git repository. I just downloaded the sources from Marvell, and indeed, there it is! It's an implementation of Reed-Solomon, a popular ECC, which fits the interpretation of the data. I'll be working on the filters, and I'm confident that I'll have a solution available soon!
« Last Edit: March 15, 2010, 10:57:48 AM by e-squizo » Logged
