Author Topic: Dead SD Card after 3 weeks : caused by Sheevaplug or something else?  (Read 23571 times)
ckaos
Newbie
*

Karma: 0
Posts: 4


View Profile
« on: December 12, 2009, 07:16:52 PM »

Hello everyone !

After a few weeks of tinkering with the SheevaPlug and U-Boot, I finally managed to get Debian booting and running great on an 8GB Patriot SD card, but after leaving the plug unpowered for about a week, the card seems to be dead (no boot in the SheevaPlug, and not recognised or working normally in Ubuntu 8.10 or WinXP).

I searched around for a possible cause, since the card was brand new. It seems the SheevaPlug's SD port is a bit too tight, but I hope that is not the source of the problem. Did anyone run into similar problems, or did I just have a case of bad luck with the card?

Thanks in advance!
Logged

MarcN
Newbie
*

Karma: 0
Posts: 2



View Profile WWW
« Reply #1 on: January 06, 2010, 04:33:28 PM »

Over the last few months I've had two 2GB PNY SD cards die, the most recent two days ago.  This time I used a 4GB SanDisk SD and today (!) it completely died.  I put it in another Linux box and there is no longer any partition table.

I've been running debian/Lenny from Martin Michlmayr's bootstrap (http://www.cyrius.com/journal/debian/kirkwood/sheevaplug).  It sees very light duty -- fetchmail every 15 minutes and some music sharing using gnump3d where the mp3s are on a large powered external USB disk.

Any suggestions or ideas on where to debug?  I'm getting tired of buying SD cards.
Logged

fedor
Newbie
*

Karma: 1
Posts: 31


View Profile
« Reply #2 on: January 07, 2010, 04:13:32 AM »

Can the cards be reformatted and used or are they ruined?

It'd be interesting to quantify what sort of life one can get from these SD cards in such a device and how best to maximise it. The internal NAND in mine has heaps of bad blocks it seems. So much so I've taken to running everything off mechanical USB hard drive.

Logged

mgillespie
Full Member
***

Karma: 7
Posts: 239



View Profile
« Reply #3 on: January 08, 2010, 03:26:45 PM »

Use the cxbiker special kernel with IO transactions and then use iotop to see where writing is occurring the most.
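
If iotop isn't available on your kernel, a rough alternative is to watch the per-device write counters in sysfs. This is only a sketch: it assumes the card shows up as mmcblk0 and that /sys/block/<dev>/stat has the usual layout (field 7 is sectors written, in 512-byte units). The function names are made up for illustration. It tells you how much is being written, not which process is doing it.

```shell
#!/bin/sh
# Print the "sectors written" counter (field 7) from the contents of
# /sys/block/<dev>/stat, passed in as a single string.
sectors_written() {
    echo "$1" | awk '{ print $7 }'
}

# Report how many KiB were written to a device during an interval.
# Usage: written_kib <dev> <seconds>
written_kib() {
    before=$(sectors_written "$(cat "/sys/block/$1/stat")")
    sleep "$2"
    after=$(sectors_written "$(cat "/sys/block/$1/stat")")
    # Sectors are 512 bytes, so two sectors make one KiB.
    echo $(( (after - before) / 2 ))
}
```

For example, `written_kib mmcblk0 60` would print the KiB written to the card over one minute.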
Logged

cjm
Jr. Member
**

Karma: 6
Posts: 69


View Profile
« Reply #4 on: February 03, 2010, 07:21:24 PM »

I've been wondering about the reliability of SD cards for the last few months, and there seem to be some issues that are worth mentioning:

  • There's a bug in the Linux MMC driver that will cause the card to be virtually "ejected" during suspend/hibernate operations and, upon resume, mounted again. Sometimes, it ends up with a different /dev name because the old name is still in use by a filesystem. When the filesystem finally wants to write the superblock (#0), the partition table is no longer there and for whatever reason, the superblock ends up in the MBR. Restoring the MBR fixes this issue until the next suspend... Probably not your issue, though
  • SD cards are mostly MLC flash RAMs with a lifetime of around 10,000 write cycles per cell (unless you got a SLC card which would live for around 100,000 write cycles per cell). Without wear-leveling, this is nothing and you could wreck an SD card in a few minutes. While most SD cards have some sort of wear leveling, some controllers seem to base their algorithms on free blocks which, lacking a command to declare a block to be free (in the past), is based on looking at the FAT. Of course, this doesn't work when formatting the card with anything but a FAT filesystem...
  • If wear-leveling is based on free blocks and using a non-FAT filesystem, wear leveling will effectively stop after the filesystem has written to all blocks at least once -- even if there's free space, the controller has no way to find out.
  • More recent/better SD card controllers seem to also consider copying "used" blocks around but it's hard to tell which kind of controller is in which card. It might be possible to detect this by the reduction in write speed once each sector has been written at least once but I never tried this. Sandisk, BTW, are somewhat open about the wear-leveling used on their cards and one might be able to get this kind of information when asking them.


When using SD cards under Linux, it seems to be a good idea to use ext2 vs. ext3 because ext3 causes even more writes for the journal. You should also consider using the deadline elevator for SD cards, telling it to focus on reads while collecting write requests in the cache for some time. I'm using the following script in /etc/rc.local to get this done when booting:


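# Tune the I/O scheduler for every SD/MMC block device.
for i in /sys/block/mmcblk*/queue; do
  [ -d "$i" ] || continue                 # no card present: nothing to do
  echo deadline >"$i/scheduler"           # switch to the deadline elevator
  echo 5000 >"$i/iosched/write_expire"    # writes may wait up to 5000 ms
  echo 500 >"$i/iosched/read_expire"      # reads should be served within 500 ms
  echo 1 >"$i/iosched/fifo_batch"         # dispatch one request per batch
  echo 4 >"$i/iosched/writes_starved"     # reads may be preferred 4 times before writes get a turn
done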
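
To check whether settings like these actually took effect after boot, something along the following lines should do. It is a sketch that assumes the kernel marks the active elevator with square brackets in the scheduler file (e.g. "noop anticipatory [deadline] cfq"); the function names are only for illustration.

```shell
#!/bin/sh
# Extract the active scheduler, i.e. the name in square brackets, from the
# contents of a queue/scheduler file.
active_scheduler() {
    echo "$1" | sed -n 's/.*\[\(.*\)\].*/\1/p'
}

# Print the active scheduler and write_expire for every SD/MMC device.
show_mmc_queue() {
    for q in /sys/block/mmcblk*/queue; do
        [ -d "$q" ] || continue
        sched=$(active_scheduler "$(cat "$q/scheduler")")
        echo "$q: scheduler=$sched write_expire=$(cat "$q/iosched/write_expire")"
    done
}
```

Running `show_mmc_queue` on the plug lists one line per card; if the scheduler is not deadline, the iosched/write_expire file will not exist and cat will complain.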


BTW, I'm using the SD card only for /boot and /var/log to reduce harddisk spinups; the rest of the system is on a USB harddisk (actually, it should be eSATA but this is not yet working with the enclosure I got). I currently have a 2GB SD card from Sandisk (cheap). Let's see how long it lasts.

Thanks,
--Christian
« Last Edit: February 04, 2010, 02:38:04 AM by cjm » Logged

UnaClocker
Full Member
***

Karma: 0
Posts: 131



View Profile WWW
« Reply #5 on: February 04, 2010, 10:20:56 AM »

I have 31 days of uptime on my Sandisk class 4 8gb card. It's my primary filesystem, I'm using ext2. Really, the only thing on this card that is constantly being written and erased are the system logs, most notably the http log.. I've probably had that card in use for about 38 days total, at this point..
Logged

SheevaPlug - 8gb class 4 SDHC primary drive, 4tb 3.5" media drive, Debian Wheezy, nginx, Samba, Shorewall

PlugPBX
Newbie
*

Karma: 4
Posts: 22


View Profile
« Reply #6 on: February 07, 2010, 12:52:41 PM »

I just had something similar occur with PlugPBX after 5 weeks of runtime on a SheevaPlug with an A-Data 4 gig Class 6 card.

It pretty much didn't do anything except answer pings. I connected via USB debug and got the following repeating over and over again.

Code:
[3604963.770000] mvsdio mvsdio: FIFO_EMPTY bit missing
[3604963.770000] EXT2-fs error (device mmcblk0p2): read_block_bitmap: Cannot read block bitmap - block_group = 14, block_bitmap = 458752
[3604963.790000] Remounting filesystem read-only
[3604963.790000] mmcblk0: retrying using single block read
[3604964.810000] mvsdio mvsdio: FIFO_EMPTY bit missing
[3604964.820000] mmcblk0: error -110 sending read/write command, response 0x400d00, card status 0x400d00
[3604964.830000] end_request: I/O error, dev mmcblk0, sector 2708988
[3604965.840000] mvsdio mvsdio: FIFO_EMPTY bit missing
[3604965.850000] mmcblk0: error -110 sending read/write command, response 0x400d00, card status 0x400d00

It's almost as if someone removed the card (nobody did), and the unit is on a UPS. Power cycling got everything up and running again.
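
For anyone wanting to keep an eye on this, here is a small sketch that counts these timeouts in the kernel log (error -110 is ETIMEDOUT on Linux). It assumes the messages look like the ones pasted above; the function name is made up.

```shell
#!/bin/sh
# Count MMC command timeouts ("error -110") in kernel log text read from stdin.
count_mmc_timeouts() {
    grep -c 'mmcblk[0-9]*: error -110'
}
```

Typical use would be `dmesg | count_mmc_timeouts`, e.g. from a cron job that mails you when the number is not zero.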

Are there any known issues with kernel versions and mmc/sd card usage with the Sheeva over time? The system was up for around 5+ weeks issue free prior to this.

I've since applied some of the suggested changes here to rc.local. FYI using a setup based on Martin Michlmayr's bootstrap, running 2.6.30-2-kirkwood on Debian Squeeze. I also have had swap turned off the entire time (no swap used). System has had lots of ram un-used even with cache filled up thus far...

Anyone else have long term issues with SD card errors like these after awhile?
Logged

cjm
Jr. Member
**

Karma: 6
Posts: 69


View Profile
« Reply #7 on: February 15, 2010, 03:47:10 PM »

I forgot to mention another piece of information regarding wear leveling: Some controllers seem to split sd-cards into fixed wear leveling units for simplicity. An older Sandisk document mentioned 4MB for a wear-leveling unit which, given 128KB as erase block size, means there's only 32 erase blocks that will ever be considered for wear leveling, plus some spare erase blocks.

Adding spare erase blocks, say 5 per wear-leveling unit, those cards end up having around 37 erase blocks per wear-leveling unit. Given this scenario and a lifetime of 10,000 write cycles per block, you would end up with 370,000 write cycles on a particular wear-leveling unit, not counting additional stress caused by copying around used blocks as part of the actual wear leveling if you're not simply writing to the same sector every time. Not exactly very much...

On my sd-card, I get roughly 900 writes per day for /var/log with a total filesystem size of around 5MB. Being pessimistic (i.e. assuming most writes end up in the same 4MB unit), this might result in a total lifetime of the card of 411 days. The underlying math is, of course, much more complex and depends on the distribution of writes but I assume the 411 days are not too far off in my use case.
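
The back-of-the-envelope calculation, spelled out. This is only the pessimistic estimate from this post: it assumes every write costs one full erase cycle and that all writes land in the same wear-leveling unit. The function name is just for illustration.

```shell
#!/bin/sh
# Figures taken from the post; all of them are rough assumptions.
blocks_per_unit=37       # erase blocks (including spares) per wear-leveling unit
cycles_per_block=10000   # assumed MLC write cycles per erase block
writes_per_day=900       # observed writes to /var/log per day

# lifetime_days <blocks> <cycles> <writes per day>
# Integer division, so the result is rounded down to whole days.
lifetime_days() {
    echo $(( $1 * $2 / $3 ))
}

lifetime_days "$blocks_per_unit" "$cycles_per_block" "$writes_per_day"   # prints 411
```

Plug in your own write rate (and 100,000 cycles for an SLC card) to see how much the estimate moves.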

Another issue is what happens when power is lost during write access (remember: the sd-card controller typically needs to read 31 sectors, merge the 1 sector you want to write and then flash the full 32 sectors of an erase unit). I recently used the plug to cross-compile a dbox2 PowerPC image (the little plug actually performed admirably for this task) but a bug in /bin/dash caused all memory to be consumed and left the plug in a vicious out-of-memory-killer loop. I had to pull the plug and, sure enough, after rebooting one of the log files in /var/log contained garbage somewhere in the middle.

All this leads me to the conclusion that using sd-cards as a root filesystem should only be considered for test purposes. If you value your data:

  • Use a "good old spinning" harddisk (2.5" units are quiet and don't require much power). In the light of plug power supplies dying when powering external USB hard drives, make sure you got a separate power supply for the disk. This is also true for SSDs
  • Use a good-quality SSD disk with a proper wear-leveling algorithm and, preferably, SLC flash
  • Find an sd-card vendor that explains their wear-leveling algorithms in such detail that you can be sure wear-leveling is across the whole sd-card and somewhat smart.

BTW, there appears to be a wear-leveling algorithm designed for sd-card controllers that might solve most issues for good -- http://www.cis.nctu.edu.tw/~lpchang/papers/sac2007_ppt.pdf -- but I have yet to find out which sd-cards use this algorithm...

Thanks,
--Christian
« Last Edit: March 07, 2010, 12:32:05 PM by cjm » Logged

PlugPBX
Newbie
*

Karma: 4
Posts: 22


View Profile
« Reply #8 on: February 17, 2010, 07:43:29 AM »

I think I've found a nice way of using SD media for the root filesystem, without the wear and tear to it ;)

apt-get install flashybrid

It sets up /etc and /var as RAM disks and syncs them back to the SD media during shutdown. Root is mounted read-only, but it's integrated into apt to allow read/write during user upgrades etc. There is also a command to 'force' a sync of the RAM disks' contents to the SD media anytime you wish, in addition to the shutdown scripts syncing.

Already tried it and it works excellent!  (Tried on Debian Squeeze)
Logged

mundhra
Newbie
*

Karma: 1
Posts: 36


View Profile
« Reply #9 on: February 17, 2010, 09:50:24 AM »

plugpbx: i checked out flashybrid and it looks like it works, but my root is still writeable. any ideas?
Logged

PlugPBX
Newbie
*

Karma: 4
Posts: 22


View Profile
« Reply #10 on: February 17, 2010, 10:09:57 AM »

The 'mountro' command is puking... which is used in the scripts by flashybrid
Code:
root@PlugPBX:/var# mountro
mount: / is busy

This certainly reduces the vast 'writes' to SD card that /var would see, so I'm still happy with this...

There must be processes still using files outside of /var and /etc - can't seem to remount root as read-only.

mountro is...

Code:
NAME
       mountro - remount the root file systems as read only

SYNOPSIS
       mountro

DESCRIPTION
       mountro remounts the root filesystem as read only. The script is just a
       wrapper to a normal mount command.

       mount -o remount,ro /

A script installed by flashybrid

« Last Edit: February 17, 2010, 10:15:40 AM by PlugPBX » Logged

mundhra
Newbie
*

Karma: 1
Posts: 36


View Profile
« Reply #11 on: February 17, 2010, 03:23:44 PM »

yeah, /var and /etc are better than nothing. maybe i'll check out iotop and see if i can make sense of anything.
« Last Edit: February 17, 2010, 03:26:13 PM by mundhra » Logged

PlugPBX
Newbie
*

Karma: 4
Posts: 22


View Profile
« Reply #12 on: February 17, 2010, 03:28:52 PM »

Hang on a sec :)

Have a look in /etc/flashybrid

Code:
-rw-r--r--  1 root root 1166 2010-02-17 15:21 config
-rw-r--r--  1 root root 1050 2008-09-22 03:27 ramstore
-rw-r--r--  1 root root  628 2008-09-22 03:27 ramtmp

One holds the global settings, including how big a RAM disk to use,
one lets you specify which directories get 'ramdisked',
and the last one lets you dictate what gets ramdisked but not synced back to your persistent storage.

Very slick.

It could be that, for a given setup, you have to cherry-pick anything that can't be read-only into 'ramstore' or 'ramtmp', depending on whether you need it to persist or not...

I might do a shotgun test and just specify all of my /var for a test (after purging cached packages and making ram disk much bigger lol)

As an example here is how the default install is setup

Code:
root@PlugPBX:/etc/flashybrid# cat ramstore
# This is a list of directories that are used to store variable data
# that is written to during normal operation of the system, and that should
# be preserved across reboots.
#
# On boot, the listed directories on your flash disk will be copied to the
# ram disk, and then bind mounts will be used to make the directories on
# the ram disk replace them.
#
# On shutdown, the contents of those directories on the ramdisk will be
# rsynced back to the flash disk.
#
# Directories listed here that are not present will be silently ignored.
# Do not put the trailing slash on directories!

# Should always be in ram disk.
/etc
/var/lib/alsa
/var/lib/asterisk
/var/lib/discover
/var/lib/dbus
/var/lib/exim4
/var/lib/op-panel
/var/lib/logrotate
/var/lib/mysql
/var/log
/var/run
/root

# If you don't use devfs/udev, you will want /dev in ram, since many things
# need to modify it when the system is running. Not needed on systems
# running udev.
#/dev

# Necessary if the system is to be able to send mail while in embedded
# mode.
/var/spool
/var/mail

and...

Code:
root@PlugPBX:/etc/flashybrid# cat ramtmp
# This is a list of directories that should be set up as temporary
# directories in the ramdisk. On boot, the directories listed will be
# created on the ramdisk, mode 1777. Their contents will not be saved
# across reboots. The equivilent directories on your flash will be
# removed and replaced with symlinks to these directories.

# Do not put the trailing slash on directories!


/tmp
/var/lock
/var/lib/alsa
/var/lib/dhcp
/var/lib/dhcp3
/var/lib/php4
/var/lib/php5
/var/lib/misc
/var/lib/ntp
/var/lib/urandom
/var/lib/apache

# This could go in ramstore if you prefer for its contents to persist
# across reboots.
/var/tmp

This looks promising..

« Last Edit: February 17, 2010, 03:30:51 PM by PlugPBX » Logged

mundhra
Newbie
*

Karma: 1
Posts: 36


View Profile
« Reply #13 on: February 17, 2010, 04:38:11 PM »

i have a feeling something is locked but not actively being written to(?) that isn't already in the ramdisk lists.

i used the sheeva with linux text kernel 31.6 with the transactions and iotop said no active writes were occurring. so much for finding some obvious red flag. :-/

i guess expanding the ramdisk to 100M and adding all of /var for a test shouldn't be too hard...
Logged

PlugPBX
Newbie
*

Karma: 4
Posts: 22


View Profile
« Reply #14 on: February 17, 2010, 09:21:13 PM »

It's doable... I just cut all the fingers off slowly until it gave in (I think rsyslogd was the culprit ...have to look at its mappings ...)

Code:
root@PlugPBX:/etc/flashybrid# mountro
root@PlugPBX:/etc/flashybrid# mount
/dev/mmcblk0p3 on / type ext2 (ro,errors=remount-ro)
..snip

So with a bare-bones test system I kept shutting down tasks based on what lsof reported as being open. The way flashybrid works is that you have to explicitly specify each end point. Its scripting creates mounts OVER TOP of the root filesystem. Those mounts are then backed by the RAM disk storage. The clever scripting takes care of syncing the root flash disk to the RAM disk at bootup, and vice versa during shutdown (or you can do so on demand via the fh-sync script they have in /sbin). You could have cron sync once a week etc. if you are paranoid.

So don't expect to just say /var and call it a day. That makes a big mess as you'd lose everything (some odd limitations to the scripting perhaps).

Basically it looks like you have to map every directory that would see a write fopen() on your root filesystem, at least past the first recursion level, I think (you can't just list /var, but /var/log is okay). It's odd that it handles /etc fine, though. Must look at the scripting more :(

It does work, but boy would it be nice to have a script that could just give me a list of open files with write locks on them *sigh*. Bonus points if it could auto-generate this config file for flashybrid. This is an ideal tool for Debian PlugComputer users to create embedded-style systems that have read-only filesystems, but can fall back and save like a read/write filesystem.
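
Something like the following might be a starting point for that wish. It is a sketch, not a tested tool: it assumes lsof's -F field output (one field per line, with "a" carrying the access mode and "n" the file name), and it lists files open for writing rather than files with actual locks. The function name is made up.

```shell
#!/bin/sh
# Read `lsof -Fan` output on stdin and print the names of files that are
# open for writing (access mode w) or for read/write (access mode u).
writable_open_files() {
    awk '
        /^f/ { mode = "" }                              # new file entry: reset mode
        /^a/ { mode = substr($0, 2) }                   # access mode: r, w, u or blank
        /^n/ { if (mode ~ /[wu]/) print substr($0, 2) } # name: print if writable
    ' | sort -u
}
```

Run as root, `lsof -Fan / | writable_open_files` should list what is still open for writing on the root filesystem; the parent directories of those files are candidates for the ramstore or ramtmp lists.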

Makes SD card root systems super simple. Done right a system can boot read only, run, survive power interruptions, but have changes / updates synced back to the read-only boot volume super easy.

I need to play more with this tomorrow.
Logged
