Author Topic: Nand flash and "bad" blocks  (Read 6871 times)
superpat
Full Member
***

Karma: 15
Posts: 141


« on: April 17, 2009, 03:34:27 AM »

Hi,

According to the Sheeva BOM, the NAND flash chips are (or can be) Samsung K9F4G08U0A-PCB0, although I have seen people showing Hynix ICs on this forum!

Looking at the K9F4G08U0A-PCB0 data sheet reveals the following gems:-

/quote

VALID BLOCK
NOTE :
1. The device may include initial invalid blocks when first shipped. Additional invalid blocks may develop while being used. The number of valid blocks is
presented with both cases of invalid blocks considered. Invalid blocks are defined as blocks that contain one or more bad bits. Do not erase or program
factory-marked bad blocks. Refer to the attached technical notes for appropriate management of invalid blocks.
2. The 1st block, which is placed on 00h block address, is guaranteed to be a valid block up to 1K program/erase cycles with 1bit/512Byte ECC.
3. The number of valid block is on the basis of single plane operations, and this may be decreased with two plane operations.

* : Each K9F4G08U0A chip in the K9K8G08U1A has Maximum 80 invalid blocks.

Parameter     Symbol   Min      Typ.   Max      Unit
K9F4G08U0A    NVB      4,016    -      4,096    Blocks
K9K8G08U1A    NVB      8,032*   -      8,192*   Blocks

AND :-

NAND Flash Technical Notes
Identifying Initial Invalid Block(s)
Initial Invalid Block(s)
Initial invalid blocks are defined as blocks that contain one or more initial invalid bits whose reliability is not guaranteed by Samsung.
The information regarding the initial invalid block(s) is called the initial invalid block information. Devices with initial invalid block(s)
have the same quality level as devices with all valid blocks and have the same AC and DC characteristics. An initial invalid block(s)
does not affect the performance of valid block(s) because it is isolated from the bit line and the common source line by a select transistor.
The system design must be able to mask out the initial invalid block(s) via address mapping. The 1st block, which is placed on
00h block address, is guaranteed to be a valid block up to 1K program/erase cycles with 1bit/512Byte ECC.
All device locations are erased(FFh) except locations where the initial invalid block(s) information is written prior to shipping. The initial
invalid block(s) status is defined by the 1st byte in the spare area. Samsung makes sure that either the 1st or 2nd page of every
initial invalid block has non-FFh data at the column address of 2048. Since the initial invalid block information is also erasable in
most cases, it is impossible to recover the information once it has been erased. Therefore, the system must be able to recognize the
initial invalid block(s) based on the original initial invalid block information and create the initial invalid block table via the following
suggested flow chart(Figure 3). Any intentional erasure of the original initial invalid block information is prohibited.

/endquote
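As a rough illustration of the check described in the technical note, here is a minimal Python sketch. It assumes you have a raw dump of the chip that includes the spare area (2048 + 64 bytes per page, 64 pages per block, which follows from the (128K + 4K) block size). The dump layout and the function name are my own assumptions for illustration; on a real system the bad block table is normally built by the NAND driver, not by hand.

```python
PAGE_DATA = 2048                      # main area bytes per page
PAGE_SPARE = 64                       # spare area bytes per page
PAGE_RAW = PAGE_DATA + PAGE_SPARE     # 2112 bytes per page in the dump
PAGES_PER_BLOCK = 64                  # (128K + 4K) / (2K + 64)
BLOCK_RAW = PAGE_RAW * PAGES_PER_BLOCK


def find_initial_invalid_blocks(dump):
    """Return block numbers whose 1st or 2nd page has non-FFh data at column 2048.

    `dump` is assumed (hypothetically) to be a raw image: page data followed
    by that page's spare bytes, for every page in order.
    """
    bad = []
    for block in range(len(dump) // BLOCK_RAW):
        base = block * BLOCK_RAW
        for page in (0, 1):
            # Column address 2048 is the 1st byte of the spare area.
            marker = dump[base + page * PAGE_RAW + PAGE_DATA]
            if marker != 0xFF:
                bad.append(block)
                break
    return bad


# Tiny demo: four erased blocks, with block 2 marked invalid on its 2nd page.
image = bytearray(b"\xff" * (4 * BLOCK_RAW))
image[2 * BLOCK_RAW + 1 * PAGE_RAW + PAGE_DATA] = 0x00
print(find_initial_invalid_blocks(bytes(image)))  # prints [2]
```

Note the scan only reads. As the data sheet says, once a factory-marked block has been erased the marker is usually gone for good, so the table has to be built before anything erases the chip.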

The interesting statistic is that Samsung allows up to 80 invalid blocks per chip in a part that is still sold as "good".

The first block is guaranteed to be good for 1000 program/erase cycles.

Here is the general spec:-

FEATURES
Voltage Supply
- 2.70V ~ 3.60V
Organization
- Memory Cell Array : (512M + 16M) x 8bit
- Data Register : (2K + 64) x 8bit
Automatic Program and Erase
- Page Program : (2K + 64)Byte
- Block Erase : (128K + 4K)Byte
Page Read Operation
- Page Size : (2K + 64)Byte
- Random Read : 25μs(Max.)
- Serial Access : 25ns(Min.)
Fast Write Cycle Time
- Page Program time : 200μs(Typ.)
- Block Erase Time : 1.5ms(Typ.)
Command/Address/Data Multiplexed I/O Port
Hardware Data Protection
- Program/Erase Lockout During Power Transitions
Reliable CMOS Floating-Gate Technology
- Endurance : 100K Program/Erase Cycles (with 1bit/512Byte ECC)
- Data Retention : 10 Years
Command Driven Operation
Intelligent Copy-Back with internal 1bit/528Byte EDC
Unique ID for Copyright Protection
Package :
- K9F4G08U0A-PCB0/PIB0
48 - Pin TSOP I (12 x 20 / 0.5 mm pitch)
- K9F4G08U0A-ICB0/IIB0
52 - Pin ULGA (12 x 17 / 1.00 mm pitch)
- K9K8G08U1A-ICB0/IIB0
52 - Pin ULGA (12 x 17 / 1.00 mm pitch)

As you can see, the endurance (life) is 100K program/erase cycles (with 1bit/512Byte ECC).
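To put the numbers from the spec together, here is a small Python sketch of the arithmetic. The page, block and valid-block figures are taken from the data sheet extracts above; the function and constant names are just mine.

```python
PAGE_DATA = 2048          # main area bytes per page (2K)
PAGE_SPARE = 64           # spare area bytes per page
PAGES_PER_BLOCK = 64      # (128K + 4K) / (2K + 64)
BLOCKS_MAX = 4096         # NVB max for the K9F4G08U0A
BLOCKS_MIN = 4016         # NVB min for the K9F4G08U0A
MIB = 1024 * 1024


def main_area_mib(blocks):
    """Usable (main area) capacity in MiB for a given number of valid blocks."""
    return blocks * PAGES_PER_BLOCK * PAGE_DATA // MIB


def spare_area_mib(blocks):
    """Spare area capacity in MiB for a given number of blocks."""
    return blocks * PAGES_PER_BLOCK * PAGE_SPARE // MIB


print(main_area_mib(BLOCKS_MAX))   # 512, the "512M" in (512M + 16M)
print(spare_area_mib(BLOCKS_MAX))  # 16, the "16M" in (512M + 16M)
print(main_area_mib(BLOCKS_MIN))   # 502, worst case with 80 invalid blocks
```

So a part at the worst-case end of the spec has about 10 MiB less usable space than a perfect one.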

You can get your own copy of the data sheet from many web sites (alldatasheets is a good one).

There are also some interesting reads about NAND flash on the MTD website.

EDIT

The MTD website (www.linux-mtd.infradead.org) does say that NAND flash is intolerant of interruptions to its operating cycle.

Sudden, unexpected power failures or shutdowns may show up as bad block errors.

AFAICS, the whole NAND flash page has to be written in one go from a buffer elsewhere in the system. If the power fails ungracefully halfway through writing the page, then the page is left with a CRC error.

(I think)

That may explain the problems some people are seeing.

regards

Patrick



« Last Edit: April 17, 2009, 03:57:26 AM by superpat »

plugit
Global Moderator
Full Member
*****

Karma: 0
Posts: 139



« Reply #1 on: April 17, 2009, 07:42:48 AM »

AFAICS, the whole NAND flash page has to be written in one go from a buffer elsewhere in the system. If the power fails ungracefully halfway through writing the page, then the page is left with a CRC error.

That's interesting. Seems like a characteristic you wouldn't really want in a wall-wart style device! It's going to happen, lots.

superpat
« Reply #2 on: April 17, 2009, 09:16:21 AM »

@plugit

I would have assumed that the SheevaPlug would have a power conditioning integrated circuit fitted, which would monitor the power rails and raise a power-unsafe flag. The CPU would then use that flag to run a power-down sequence and stop the system gracefully before the voltage level dropped too low.

(That's the way we used to do it when I was in lecky design, many, many moons ago.)

How the Plug does it, I dunno. I will have a look at the schematic and see if there is an inbuilt power monitor etc.

I would have thought it would be essential in a power plug type device!

cheers

Patrick

plugit
« Reply #3 on: April 17, 2009, 09:52:02 AM »

Hmmm... Perhaps disabling atime may be helpful for minimizing damage?

superpat
« Reply #4 on: April 17, 2009, 11:11:26 AM »

@plugit

This is something different from normal wearout.

atime (and swap files), logging and journaling file systems are all bad news for flash memory. I don't think you need to get too paranoid about them, but it is obviously better for the flash to do the minimum of write cycles.

If the Linux MTD driver's wear leveling, bad block algorithms and ECC correction are working correctly, then you should NEVER see any failures due to bad cells. What normally happens is that if a cell goes bad, the block is marked bad and added to the bad block list. What you see is a flash memory that gradually, over months or years, gets smaller and smaller in usable capacity.

I have had a bit of a read of the flash chip spec sheet. There is this caveat:-

/quote

Data Protection & Power up sequence

The device is designed to offer protection from any involuntary program/erase during power-transitions. An internal voltage detector
disables all functions whenever Vcc is below about 2V. WP pin provides hardware protection and is recommended to be kept at VIL
during power-up and power-down. A recovery time of minimum 100μs is required before internal circuit gets ready for any command
sequences as shown in Figure 21. The two step command sequence for program/erase provides additional software protection.

/endquote


Now I have had a look at the plug schematic. The WP pin on the chip is hardwired to the 3.3V rail via a 4K7 resistor. So there is no H/W protection at the chip. If the 3.3V is flaky then yuk!

The protection scheme must be done in the Sheeva CPU and the S/W. There is a reset signal generated on one of the Sheeva GPIO pins that is used to reset (or set) the U8, U9 and U10 switching regs. There is also a 40ms delay chain between each of the regulators.

However, I cannot find any info on the Sheeva chip in the Marvell data sheet that indicates that the chip has any inbuilt power monitoring, and I cannot see any external power monitoring (analogue or A-D) fed to the Sheeva.

The schematics are very large and are difficult to read on my monitor. It would be good to have a functional schematic (block diagram) of the plug itself; that would make the power sequencing easier to follow.

cheers

P

 

plugit
« Reply #5 on: April 17, 2009, 01:07:56 PM »

I'm enjoying reading the results of your research - very illuminating, thanks!

So: it sounds like this device may be useful as-is if mounted read-only, with perhaps a read-write partition for saving configuration changes (if there are any) to NAND. But with a typical caveat of "do not power down this device ... etc.". Certainly sounds like using the NAND as a live filesystem is (maybe) dangerous ground.

I really do think my root filesystem is degrading over time. S'okay for me, because I've abandoned it in favor of a USB drive, but I'd far rather have a small NAND filesystem and mount the drive under /usr/local, in order to be able to power down / remove / replace the drive independent of device uptime.

If these assumptions about the NAND are right on, the SheevaPlug is sitting in a weird limbo place somewhere between a read-only appliance and a server, too powerful for the one and too unstable for the other.

Curious.

tschaboo
Newbie
*

Karma: 0
Posts: 15


« Reply #6 on: June 07, 2009, 11:56:47 AM »

Hello plugit and superpat,

I've had a lot of time between ordering and delivery of my Sheeva to read a lot about NAND flash too. I want to address two aspects:

1.) The problem with half-written blocks isn't really a problem. When a JFFS2 filesystem is mounted, every block is checked and the ECC error will be detected. Data which is okay will be copied to another eraseblock. After that, the block isn't marked as bad immediately, but checked up to two times [1]. So you won't "lose" your block just because of a power failure. UBIFS/UBI have similar precautions [2].

Also, the behavior visible to the user will be the same as with any other (journalled) filesystem on a normal hard disk: if you power down while stuff hasn't yet been written to the device, it will be lost, but the filesystem will be in a consistent state (at least with UBIFS, I don't know for sure with JFFS2). If you want atomic transactions, you can get them by using the standard POSIX procedure: write to a new file, flush, fsync, rename. Again, the same as with ext3/xfs/etc...
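For anyone who hasn't seen the write/flush/fsync/rename procedure spelled out, here is a minimal sketch in Python. The helper name `atomic_write` is made up for illustration; the calls it uses (`tempfile.mkstemp`, `os.fsync`, `os.replace`) are from the standard library. It assumes the temporary file can live in the same directory as the target, which is needed for the rename to be atomic.

```python
import os
import tempfile


def atomic_write(path, data):
    """Replace `path` with `data` so readers see the old or the new file, never half of one."""
    directory = os.path.dirname(os.path.abspath(path))
    # 1. Write to a new file on the same filesystem as the target.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()               # 2. flush Python's userspace buffer
            os.fsync(tmp.fileno())    # 3. ask the kernel to push it to the device
        os.replace(tmp_path, path)    # 4. atomic rename over the old file
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

Usage would be something like `atomic_write("/etc/myapp.conf", b"key=value\n")` (the path is only an example). If the power goes during steps 1 to 3, the old file is untouched and you are left with a stray temporary file at worst.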

2.) Regarding atime and journalled file-systems on FTL-devices (this doesn't matter for the sheeva-nand of course):
While I'd turn off atime if I don't need it, simply because it's a waste of write cycles, I don't really see a problem as long as your FTL is really good (like in Intel's X25-E/M). Suppose the journal is committed every 5 seconds, and every commit writes one page, say 2 KiB. That would be 12 GiB per year, which would be 0.15 erase cycles on an Intel 80 GB SSD or 24 cycles on a 512 MB device. Nothing to really worry about. Of course, if the wear levelling is not done well, that calculation won't work.

On raw NAND that's not a problem at all. If you're using UBIFS you get your journal for free, because UBIFS writes a full journal (with data) and later updates the filesystem tree to point directly to that data in the (former) journal. The actual journal just moves forward. So data isn't written twice because of the journal. It's similar to a log-structured filesystem [3].
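The back-of-envelope numbers can be reproduced with a few lines of Python. This is only a sketch of the estimate above: it assumes perfect wear levelling, a 365-day year, and that the device sizes are counted in binary units (80 GiB and 512 MiB), which is what makes the figures come out at roughly 0.15 and 24.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600
MIB = 1024 ** 2
GIB = 1024 ** 3


def yearly_journal_bytes(commit_interval_s=5, page_bytes=2048):
    """Bytes written per year if each journal commit writes exactly one page."""
    return SECONDS_PER_YEAR // commit_interval_s * page_bytes


def full_device_cycles(bytes_written, device_bytes):
    """Erase cycles per block, assuming writes are spread evenly over the device."""
    return bytes_written / device_bytes


written = yearly_journal_bytes()
print(f"{written / GIB:.2f} GiB written per year")
print(f"{full_device_cycles(written, 80 * GIB):.2f} cycles/year on an 80 GiB SSD")
print(f"{full_device_cycles(written, 512 * MIB):.1f} cycles/year on a 512 MiB device")
```

Against a rated endurance in the 100K cycle range, either figure is tiny, which is the point of the estimate.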

[1] "How is ensured, that data is written to flash?" and "How does JFFS2 handle a block going bad in NAND flash?" at http://www.linux-mtd.infradead.org/faq/jffs2.html
[2] http://www.linux-mtd.infradead.org/faq/ubifs.html#L_powercut
[3] http://en.wikipedia.org/wiki/Log-structured_filesystem
« Last Edit: June 07, 2009, 11:59:13 AM by tschaboo »
