United States
(12) Patent Application Publication
`
`(54) NON-VOLATILE MEMORY DATA STORAGE
`SYSTEM WITH RELIABILITY
`MANAGEMENT
`
(75) Inventors: Roger Chin, San Jose, CA (US); Gary Wu, Fremont, CA (US)

Correspondence Address:
Tung & Associates
Suite 120
838 W. Long Lake Road
Bloomfield Hills, MI 48302 (US)

(73) Assignee: Nanostar Corporation, U.S.A.

(21) Appl. No.: 12/471,430

(22) Filed: May 25, 2009

Related U.S. Application Data

(63) Continuation-in-part of application No. 12/218,949, filed on Jul. 19, 2008, Continuation-in-part of application No. 12/271,885, filed on Nov. 15, 2008, Continuation-in-part of application No. 12/372,028, filed on Feb. 17, 2009.

(10) Pub. No.: US 2010/0017650 A1
(43) Pub. Date: Jan. 21, 2010
`
Publication Classification

(51) Int. Cl.
G06F 12/16 (2006.01)
G06F 12/02 (2006.01)
G06F 13/28 (2006.01)
G06F 11/14 (2006.01)
G06F 12/08 (2006.01)

(52) U.S. Cl. .......... 714/6; 711/103; 710/22; 711/114; 711/206; 711/118; 711/E12.008; 711/E12.103; 711/E12.017; 714/E11.127

(57) ABSTRACT
`
A non-volatile memory data storage system, comprising: a host interface for communicating with an external host; a main storage including a first plurality of flash memory devices, wherein each memory device includes a second plurality of memory blocks, and a third plurality of first stage controllers coupled to the first plurality of flash memory devices; and a second stage controller coupled to the host interface and the third plurality of first stage controllers through an internal interface, the second stage controller being configured to perform a RAID operation for data recovery according to at least one parity.
`
[Front-page drawing: system 100 with a Host, Host Interface 120, Controller 142 (DMA/I/F 401-404, XOR Engine 149, parity generation), distributed channels with first stage reliability functions (ECC/EDC/BBM/WL/GC), and main storage 160 with modules 161-164 holding data blocks A1-D3, A- to D-parity, a working space, and spare blocks Spare-1 to Spare-4; reliability measured by, e.g., MTBF, UBER.]
`Micron Ex. 1030, p. 1
`Micron v. Vervain
`IPR2021-01550
`
`
`
Patent Application Publication    Jan. 21, 2010  Sheet 1 of 16    US 2010/0017650 A1

[FIG. 1A (drawing): two-stage control architecture with a 2nd stage controller and 1st stage controllers 1441-144N.]
`
`
`
Patent Application Publication    Jan. 21, 2010  Sheet 2 of 16    US 2010/0017650 A1

[FIG. 1B (drawing): distributed channels with embedded reliability management; 2nd stage and 1st stage controller layers.]
`
`
Patent Application Publication    Jan. 21, 2010  Sheet 3 of 16    US 2010/0017650 A1

[FIG. 2 (drawing): main storage 160 divided into regions with different capacity indexes.]
`
`
`
Patent Application Publication    Jan. 21, 2010  Sheet 4 of 16    US 2010/0017650 A1

[FIG. 3 (drawing): RAID-4 configuration with modules M1-M4 and row parity.]
`
`
Patent Application Publication    Jan. 21, 2010  Sheet 5 of 16    US 2010/0017650 A1

[FIG. 4 (drawing): RAID-5 configuration with distributed parity and a spare module.]
`
`
`
Patent Application Publication    Jan. 21, 2010  Sheet 6 of 16    US 2010/0017650 A1

[FIG. 5 (drawing): block-level repair and recovery; first and second defected blocks rebuilt from spare blocks.]
`
`
Patent Application Publication    Jan. 21, 2010  Sheet 7 of 16    US 2010/0017650 A1

[FIG. 6 (drawing): spare blocks within a memory module repairing defected blocks in the same module, with a remapping table.]
`
`
Patent Application Publication    Jan. 21, 2010  Sheet 8 of 16    US 2010/0017650 A1

[FIG. 7 (drawing): RAID-6 configuration with spare blocks reserved in each memory module.]
`
`
`
Patent Application Publication    Jan. 21, 2010  Sheet 9 of 16    US 2010/0017650 A1

[FIG. 8 (drawing): detachable memory module used as swap space or cache; channel buffers BUF11-BUF42.]
`
`
`
Patent Application Publication    Jan. 21, 2010  Sheet 10 of 16    US 2010/0017650 A1

[FIG. 9 (drawing): cache for random write data and dual SRAM buffers for sequential write data ahead of the main storage.]
`
`
`
Patent Application Publication    Jan. 21, 2010  Sheet 11 of 16    US 2010/0017650 A1

[FIG. 10 (drawing): data paths for read hit, read miss, write hit, and write miss.]
`
`
`
Patent Application Publication    Jan. 21, 2010  Sheet 12 of 16    US 2010/0017650 A1

[FIG. 11 (drawing): first stage wear leveling tables.]
`
`
`
Patent Application Publication    Jan. 21, 2010  Sheet 13 of 16    US 2010/0017650 A1

[FIG. 12 (drawing): address translation among segment address, logical block address ID, logical block address and physical block address, with an erase/program count table for wear leveling.]
`
`
`
Patent Application Publication    Jan. 21, 2010  Sheet 14 of 16    US 2010/0017650 A1

[FIG. 13 (drawing): flowchart of second stage wear leveling based on segment erase counts.]
`
`
`
Patent Application Publication    Jan. 21, 2010  Sheet 15 of 16    US 2010/0017650 A1

[FIG. 14 (drawing): system block diagram with BIST/BISD/BISR functions and host interface.]
`
`
`
Patent Application Publication    Jan. 21, 2010  Sheet 16 of 16    US 2010/0017650 A1

[FIG. 15 (drawing): embodiment using down-grade or less endurable flash devices.]
`
`
`
US 2010/0017650 A1    Jan. 21, 2010
`
`NON-VOLATILE MEMORY DATA STORAGE
`SYSTEM WITH RELIABILITY
`MANAGEMENT
`
`CROSS REFERENCE TO RELATED
`APPLICATIONS
`
[0001] The present invention is a continuation-in-part application of U.S. Ser. No. 12/218,949, filed on Jul. 19, 2008, of U.S. Ser. No. 12/271,885, filed on Nov. 15, 2008, and of U.S. Ser. No. 12/372,028, filed on Feb. 17, 2009.
`
`BACKGROUND OF THE INVENTION
`
[0002] 1. Field of the Invention
[0003] The present invention relates to a non-volatile memory (NVM) data storage system with reliability management, in particular to an NVM data storage system which includes a main storage of, e.g., solid state drive (SSD) or memory card modules, in which the reliability of the stored data is improved by utilizing distributed embedded reliability management in a two-stage control architecture. The system is preferably configured as RAID-4, RAID-5 or RAID-6 with one or more remappable spare modules, or with one or more spare blocks in each module, to further prolong the lifetime of the system.
[0004] 2. Description of Related Art
[0005] Memory modules made of non-volatile memory devices, in particular solid state drives (SSD) and memory cards which include NAND flash memory devices, have great potential to replace hard disk drives (HDD) because they have faster speed, lower power consumption, better ruggedness and no moving parts in comparison with HDD. A data storage system with such flash memory modules will become more acceptable if its reliability quality can be improved, especially if the endurance cycle issue of MLCxN (N=2, 3 or 4, i.e. multi-level cell with 2 bits per cell, 3 bits per cell and 4 bits per cell) is properly addressed.
[0006] One of the major failure symptoms affecting the silicon wafer yield of NAND flash devices is the reliability issue. Providing a data storage system with better capability of handling reliability issues not only improves the quality of the data storage system but can also increase the wafer yield of flash devices. The utilization rate of each flash device wafer can be greatly increased, since the system can use flash devices that are tested out with inferior criteria.
[0007] As the process technology for manufacturing NAND flash devices keeps advancing and the die size keeps shrinking, the value of Mean-Time-Between/To-Failure (MTBF/MTTF) of the NAND-flash-based SSD system decreases and the value of Uncorrectable-Bit-Error-Rate (UBER) increases. The typical SSD UBER is usually one error per 10¹⁵ bits read.
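As a back-of-the-envelope illustration of the UBER figure quoted above (a sketch only; the function name is ours, not the specification's):

```python
def uber(uncorrectable_errors, bits_read):
    """Uncorrectable-Bit-Error-Rate: uncorrectable errors per bit read."""
    return uncorrectable_errors / bits_read

# The typical SSD figure quoted above: one error per 10**15 bits read.
assert uber(1, 10**15) == 1e-15
```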
[0008] Another aspect that affects reliability characteristics of a flash-based data storage system is write amplification. The write amplification factor (WAF) is defined as the data size written into the flash memory versus the data size from the host. For a typical SSD, the write amplification factor can be 30 (i.e., 1 GB of data from the host causes 30 GB of data to be programmed into the flash).
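The WAF definition above is simple enough to state directly; a minimal sketch in Python (names illustrative):

```python
def write_amplification_factor(flash_bytes_written, host_bytes_written):
    """WAF = data size written into flash / data size received from the host."""
    return flash_bytes_written / host_bytes_written

# The example above: 1 GB from the host causing 30 GB of flash writes gives WAF = 30.
assert write_amplification_factor(30 * 2**30, 1 * 2**30) == 30.0
```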
[0009] A data storage system with good reliability management is capable of improving MTBF and UBER and reducing WAF, while enjoying the cost reduction resulting from shrunk die size. Thus, a data storage system with good reliability management is very much desired.
`
`SUMMARY OF THE INVENTION
`
[0010] In view of the foregoing, an objective of the present invention is to provide an NVM data storage system with distributed embedded reliability management in a two-stage control architecture, which is in contrast to the conventional centralized single controller structure, so that the reliability management loading can be shared among the memory modules. The reliability quality of the system is thus improved.
[0011] Two important measures of reliability for a flash-based data storage system are MTBF and UBER. ECC/EDC, BBM, WL and RAID schemes are able to improve the reliability of the system, and thus improve the MTBF and UBER. The present invention proposes several schemes to improve WAF and other reliability factors; such schemes include but are not limited to (a) distributed channels, (b) spare blocks in the same or a spare module for recovering data in a defected block, (c) a cache scheme, (d) a double-buffer, (e) a reconfigurable RAID structure, and (f) region arrangement by different types of memory devices. In the distributed channel architecture, preferably, each channel includes a double-buffer, a DMA, a FIFO, a first stage controller and a plurality of flash devices. This distributed channel architecture minimizes unnecessary writes into flash devices because the write for each channel is independently controlled.
[0012] To improve the reliability of the data storage system, the system is preferably configured as RAID-4, RAID-5 or RAID-6 and has recovery and block repair functions with spare blocks/modules. A once defected block is replaced by a spare block, either in the same memory module or in a spare module, with the same logical block address but a remapped physical address.
[0013] More specifically, the present invention proposes an NVM data storage system comprising: a host interface for communicating with an external host; a main storage including a first plurality of flash memory devices, wherein each memory device includes a second plurality of memory blocks, and a third plurality of first stage controllers coupled to the first plurality of flash memory devices; and a second stage controller coupled to the host interface and the third plurality of first stage controllers through an internal interface, the second stage controller being configured to perform a RAID operation for data recovery according to at least one parity.
[0014] Preferably, in the NVM data storage system, the first plurality of flash devices are allocated into a number of distributed channels, wherein each channel includes one of the first stage controllers and further includes a DMA and a buffer coupled with that first stage controller in the same channel.
[0015] Preferably, in the NVM data storage system, the controller maintains a remapping table for remapping a memory block to another memory block.
[0016] Preferably, the NVM data storage system further comprises an additional, preferably detachable, memory module which can be used as swap space, cache, or a confined, dedicated hot zone for frequently accessed data.
[0017] Preferably, each channel of the NVM data storage system comprises a double-buffer. The double-buffer includes two SRAM buffers which can operate simultaneously.
`
`
`
`
`
[0018] Also preferably, the NVM data storage system implements a second stage wear leveling function. The second stage wear leveling is performed across the memory modules ("globally"). The main storage is divided into a plurality of regions, and the controller performs the second stage wear leveling operation depending on an erase count associated with each region. The system maintains a second stage wear leveling table which includes the address translations between the logical block addresses within each region and the logical block addresses of the first stage memories.
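The second stage wear leveling table described above translates region-local logical block addresses into first stage logical block addresses. A toy sketch of such a table, assuming a simple dictionary layout (field names are ours, not the specification's):

```python
# Toy second stage wear leveling table: (region, region-local LBA) -> first stage LBA.
class SecondStageWLTable:
    def __init__(self):
        self.table = {}

    def map(self, region, local_lba, stage1_lba):
        self.table[(region, local_lba)] = stage1_lba

    def translate(self, region, local_lba):
        return self.table[(region, local_lba)]

    def swap(self, a, b):
        """Exchange two translations, e.g. to move hot data into a less worn region."""
        self.table[a], self.table[b] = self.table[b], self.table[a]

t = SecondStageWLTable()
t.map("Region-1", 0, 100)     # Region-1 local block 0 lives at first stage LBA 100
t.map("Region-2", 0, 200)
t.swap(("Region-1", 0), ("Region-2", 0))
assert t.translate("Region-1", 0) == 200
```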
[0019] In another aspect, the present invention discloses an NVM data storage system which comprises: a main storage including a plurality of memory modules, wherein the data storage system performs a reliability management operation on each of the plurality of memory modules individually; and a controller coupled to the main storage and configured to perform at least two kinds of RAID operations for storing data according to a first and a second RAID structure, wherein data is first stored in the main storage according to the first RAID structure, e.g., RAID-0 or RAID-1, and is reconfigurable to the second RAID structure such as RAID-4, -5 or -6.
[0020] In another aspect, the present invention discloses an NVM data storage system which comprises: a host interface for communicating with an external host; a main storage including a plurality of memory modules, wherein the data storage system performs a distributed reliability management operation on each of the plurality of memory modules individually, the reliability management operation including at least one of error correction coding, error detection coding, bad block management, wear leveling, and garbage collection; and a controller coupled to the host interface and to the main storage, the controller being configured to perform a RAID-4 operation for data recovery.
[0021] In another aspect, the present invention discloses an NVM data storage system which comprises: a main storage including a plurality of flash devices divided into a plurality of channels; a controller configured to reduce erase/program cycles of the main storage; and a memory module coupled to the controller and serving as cache memory; wherein reliability management operations including error correction coding, error detection coding, bad block management and wear leveling are performed on each channel individually.
`[0022]
`It is to be understood that both the foregoing general
`description and the following detailed description are pro-
`vided as examples, for illustration rather than limiting the
`scope of the invention.
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`
[0023] The foregoing and other objects and features of the present invention will become better understood from the following descriptions and appended claims when read in connection with the accompanying drawings.
`[0024]
`FIG. 1A illustrates a non-volatile memory data stor-
`age system with reliability management in a two stage control
`architecture according to the present invention. The system
`includes a host interface, a controller, and a main storage
`including multiple memory modules.
`[0025]
`FIG. 1B shows an embodiment with distributed
`channels and distributed embedded reliability management.
`[0026]
`FIG. 2 is a block diagram of the main storage 160
`including regions with different capacity indexes.
`[0027]
`FIG. 3 shows an embodiment of the present inven-
`tion employing RAID-4 configuration.
`
[0028] FIG. 4 shows an embodiment of the present invention employing RAID-5 configuration, with a spare module.
`[0029]
`FIG. 5 shows an embodiment with block-level
`repair and recovery functions.
`[0030]
`FIG. 6 shows an embodiment with block-level
`repair and recovery functions, wherein a memory module
`reserves one or more spare blocks to repair a defected block in
`the same memory module. A remapping table shows the
`remapping information for the defected blocks.
`[0031]
`FIG. 7 shows an embodiment of the present inven-
`tion employing RAID-6 configuration, wherein a memory
`module reserves one or more spare blocks to repair a defected
`block in the same memory module.
`[0032]
`FIG. 8 shows an embodiment of the present inven-
`tion which includes a memory module which is used as a
`swap space or cache. The memory module can be detachable.
`[0033]
FIG. 9 illustrates that the cache 180 stores the random write data to reduce the Write Amplification Factor (WAF). The dual-buffer stores the sequential write data and also stores the data flushed from the cache 180 before storing these data to the main storage 160.
`[0034]
`FIG. 10 shows the data paths of read hit, read miss,
write hit, and write miss.
`[0035]
`FIG. 11 shows the first stage wear leveling tables.
`[0036]
`FIG. 12 shows the address translation for segment
`address, logical block address ID, logical block address and
`physical block address; it also shows the erase/program count
`table for wear leveling.
`[0037]
`FIG. 13 is a flowchart showing second stage wear
`leveling operation based on the segment erase count.
`[0038]
`FIG. 14 shows a block diagram of an embodiment of
`the system according to the present invention, which includes
`BIST/BISD/BISR
`(Built-In-Self-Test/Diagnosis/Repair)
`functions.
`[0039]
`FIG. 15 shows an embodiment of the present inven-
`tion wherein down-grade or less endurable flash devices are
`used.
`
`DESCRIPTION OF THE PREFERRED
`EMBODIMENTS
`
[0040] The present invention will now be described in detail with reference to preferred embodiments thereof as illustrated in the accompanying drawings.
[0041] FIG. 1A shows an NVM storage system 100 according to the present invention, which employs distributed embedded reliability management in a two-stage control architecture (the terms "distributed" and "embedded" will be explained later). The reliability management architecture according to the present invention provides great benefit because good reliability management will not only improve the quality of the data and prolong the lifetime of the storage system, but also increase the manufacturing yield of flash memory device chips in a semiconductor wafer, since the number of usable dies increases.
`[0042]
`The system 100 includes a host interface 120, a
`controller 142 and a main storage 160. The host interface 120
`is for communication between the system and a host. It can be
`SATA, SD, SDXC, USB, UFS, SAS, Fiber Channel, PCI,
`eMMC, MMC, IDE or CF interface. The controller 142 per-
`forms data read/write and reliability management operations.
`The controller 142 can be coupled to the main storage 160
`through any interface such as NAND, LBA_NAND,
`BA_NAND, Flash_DIMM, ONFI NAND, Toggle-mode
`NAND, SATA, SD, SDXC, USB, UFS, PCI or MMC, etc.
`
`
`
`
`
The main storage 160 includes multiple memory modules 161-16N, each including multiple memory devices 61-6N. In one embodiment, the memory devices are flash devices, which may be SLC (Single-Level Cell), MLC (Multi-Level Cell, usually meaning 2 bits per cell), MLCx3 (3 bits per cell), MLCx4 (4 bits per cell) or MLCx5 (5 bits per cell) memory devices. Preferably, the system 100 employs a two-stage reliability control scheme wherein each of the memory modules 161-16N is provided with a first stage controller 1441-144N for embedded first stage reliability management, and the controller 142 performs a global second stage reliability management.
[0043] Referring to FIG. 1B, the reliability management tasks include one or more of error correction coding/error detection coding (ECC/EDC), bad block management (BBM), wear leveling (WL) and garbage collection (GC). The ECC/EDC and BBM operations are well known by one skilled in this art, and thus they are not explained here. The garbage collection operation erases the invalid pages and sets the erased blocks free. If one or more valid pages reside in a to-be-erased block, such pages are reallocated to another block which has available space and is not to be erased. The wear leveling operation reallocates data which are frequently accessed to a block which is less frequently accessed. It improves reliability characteristics including endurance, read disturbance and data retention. The reallocation of data in a block causes the flash memory cells to be re-charged or re-discharged. The threshold voltages of those re-written cells are restored to the original target levels; therefore the data retention and read disturbance characteristics are improved. Especially, because the retention quality of the MLCx3 and MLCx4 flash devices is worse and their read disturbance is more severe than MLCx2 flash devices, WL is even more important when MLCx3 and MLCx4 flash devices are employed in the main storage 160. According to the present invention, such reliability management operations are performed in an embedded fashion, that is, they are performed on each storage module individually, at least as a first stage reliability management. The controller 142 may perform a second stage reliability management across all or some of the storage modules.
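The garbage collection step described in [0043], relocating valid pages out of a victim block before erasing it, can be sketched as follows (the page/block model, dicts with a "valid" flag, is a simplification we assume for illustration):

```python
def garbage_collect(victim, free_block):
    """Relocate valid pages out of the victim block, then erase it.

    Mirrors the step in [0043]: valid pages residing in a to-be-erased block
    are reallocated to another block with available space before the erase.
    """
    for page in victim:
        if page["valid"]:
            free_block.append(page)   # reallocate the still-valid page
    victim.clear()                    # erase the block, setting it free
    return victim

blk = [{"data": "a", "valid": True}, {"data": "b", "valid": False}]
dest = []
garbage_collect(blk, dest)
assert blk == [] and [p["data"] for p in dest] == ["a"]
```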
[0044] The system 100 is defined as having a "distributed" embedded reliability management architecture because it includes distributed channels, each of which is subject to embedded reliability management. In FIG. 1B, as an example, the main storage 160 includes four distributed channels (only two channels are marked for simplicity of the drawing), and each channel is provided with a memory module, i.e., the memory modules 161-164. The channels are also referred to as ports. Each channel is also provided with an interface 401-404, preferably including a DMA (Direct-Memory-Access, or ADMA, i.e. Advanced-DMA) and a FIFO (not shown), in correspondence with each memory module 161-164. The ADMA can adopt a scatter-and-gather algorithm to increase transfer performance.
[0045] The controller 142 is capable of performing RAID operations, such as RAID-4 as shown in FIG. 1B, or other types of RAID operations such as RAID-0, 1, 2, 3, 5, 6, etc. (For details of RAID, please refer to the parent application U.S. Ser. No. 12/218,949.) In the RAID-4 structure, the system generates a parity for each row of data stored (A-, B-, C-, and D-parity), and the parity bits are stored in the same module. Preferably, the controller 142 includes a dedicated hardware XOR engine 149 for generating such parity bits.
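The row-parity scheme handled by the XOR engine 149 rests on the standard XOR property: the parity of a row is the XOR of its data blocks, and any single lost block is the XOR of the survivors. A sketch with toy block sizes (not the engine's actual implementation):

```python
from functools import reduce

def xor_parity(blocks):
    """Bytewise XOR across one row of equal-sized blocks."""
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), blocks)

row = [b"\x01\x02", b"\x04\x08", b"\x10\x20"]   # one row of data blocks (toy sizes)
parity = xor_parity(row)                        # stored in the parity module

# A single lost block equals the XOR of the surviving blocks and the parity.
recovered = xor_parity([row[0], row[2], parity])
assert recovered == row[1]
```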
`
[0046] The system 100 has recovery and block repair functions, and is capable of performing remapping operations to remap data access to a new address. There are several ways to allow for data remapping, which will be further described later with reference to FIGS. 4-7. In FIG. 1B, which shows one among several possible schemes in the present invention, each module 161-164 reserves at least one spare block (Spare-1 to Spare-4) which is not used as working space. Whenever a block in the working space is defected, the defected block is remapped to the spare block in the same module (Spare-1 in module 161, Spare-2 in module 162, etc.). The module with the defected block is thereby repaired and functions as normal after the remapping; thus the data storage system can continue its operations after the repair. The parity blocks (A-, B-, C-, and D-parity) can be used for data recovery and rebuild. More details of this scheme will be described later with reference to FIG. 7.
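The spare-block repair of [0046], same logical address, remapped physical address, can be sketched with a hypothetical per-module logical-to-physical table (class and field names are ours):

```python
class ModuleRemapper:
    """Hypothetical per-module map from logical block address to physical block."""

    def __init__(self, mapping, spares):
        self.mapping = dict(mapping)   # logical -> physical working blocks
        self.spares = list(spares)     # reserved spare physical blocks

    def repair(self, logical):
        """Remap a defected logical block onto a spare; the logical address is unchanged."""
        self.mapping[logical] = self.spares.pop(0)
        return self.mapping[logical]

m = ModuleRemapper({"A1": 0, "A2": 1}, spares=[99])
m.repair("A1")                         # A1 is now served from spare physical block 99
assert m.mapping["A1"] == 99 and m.mapping["A2"] == 1
```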
[0047] The main storage 160 can be divided into multiple regions in the way shown in FIG. 2. Each region includes one segment in each memory module 161-16N. Each segment may include multiple blocks. In this embodiment, as shown in FIG. 2, a memory module may include memories of different types, i.e., two or more of SLC, MLC, MLCx3, and MLCx4 memories. It can also include down-grade memories which have less than 95% usable density. The memories with the best endurance can be grouped into one region and used for storing more frequently accessed data. For example, in this embodiment, Region-1 includes SLC flash memories and can be used as a cache memory.
[0048] According to the present invention, a capacity index is defined for each region. Different regions can have different capacity indexes depending on the type of flash memory employed by each region. The index is related to the endurance quality of the flash devices. The endurance specification of SLC usually achieves 100 k cycles, while it is 10 k for MLCx2, 2 k for MLCx3 and 500 for MLCx4. Thus, for example, we can define the capacity index as 1 for MLCx4, 4 for MLCx3, 20 for MLCx2 and 200 for SLC flash, in correspondence to their respective endurance characteristics. The capacity index is useful in the wear leveling operation, especially when heterogeneous regions are employed, with different flash devices in different regions.
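The capacity indexes defined in [0048] let regions built from different flash types be compared on one wear scale. One illustrative policy (our own, not the specification's exact algorithm) normalizes each region's erase count by its capacity index:

```python
# Capacity indexes from [0048], proportional to endurance (500 / 2k / 10k / 100k cycles).
CAPACITY_INDEX = {"MLCx4": 1, "MLCx3": 4, "MLCx2": 20, "SLC": 200}

def least_worn_region(regions):
    """Pick the region with the lowest erase count relative to its endurance.

    `regions` maps a region name to (flash type, erase count); dividing by the
    capacity index puts heterogeneous regions on one comparable wear scale.
    """
    return min(regions, key=lambda r: regions[r][1] / CAPACITY_INDEX[regions[r][0]])

regions = {"Region-1": ("SLC", 5000), "Region-2": ("MLCx3", 300)}
# 5000/200 = 25 vs 300/4 = 75, so the SLC region is the less worn one.
assert least_worn_region(regions) == "Region-1"
```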
[0049] The main storage 160 is configured under a RAID architecture. In one embodiment, it can be configured in the RAID-4 architecture shown in FIG. 3. In this example the main storage 160 includes four modules M1-M4. Each module includes multiple memory devices and each memory device includes multiple memory blocks (only three blocks per device are shown, but the actual block number is much larger). The data are written across the modules M1-M3 by row, and each row is given a parity (p) which is stored in the module M4. Any data lost in a block (i.e., a defected block) can be recovered by the parity bits.
[0050] FIG. 4 shows another embodiment. In this embodiment, the main storage 160 is configured in the RAID-5 architecture, wherein the parity bits (p) are scattered across all the memory modules. In this example the main storage 160 includes four modules M1-M4 and further includes a hot spare module. Each module includes multiple memory devices and each memory device includes multiple memory blocks (only three blocks per device are shown, but the actual block number is much larger). The data are written across the modules M1-M4 by row, and each row is given a parity (p). In case a defected block is found in a module, such as M2 as
`
shown in the left-hand side of the figure, the lost data can be recovered with the help of the parity. And as the right-hand side of the figure shows, the once defected module becomes a spare module after it is remapped. A user may later replace the once defected module with a new module.
[0051] FIG. 5 shows another embodiment of the present invention, which allows block-level repair. In case one or more defected (failure) blocks are found, the spare blocks in the spare module can be used to rebuild/repair the failing blocks, including the parity blocks. The parity (p) can help to recover the lost data in the defected block. If the defected block is the parity block, the parity can be re-generated and rewritten to the spare device. The first column in the remapping table records the mapping information of the first failure block for that row. The second column records the mapping information of the second failure block for that same row. In the shown example, C1 is the first failure block in the row consisting of C1, p, C2, and C3, and E3 is the first failure block in the row consisting of E1, E2, E3, and p. Thus, the remapping table records the information such that any access to the original C1 and E3 blocks is remapped to the replacing blocks in the spare module. The scheme allows for a second failure block in the same row (such as C3), and the remapping table records it in the second column.
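The two-column remapping table of [0051] can be sketched as a per-row list holding at most two (failed block, spare block) pairs (a toy model; row and block names are ours):

```python
# Hypothetical two-column remapping table: row -> [(first failure, spare), (second failure, spare)].
remap = {}

def record_failure(row, bad_block, spare_block):
    """Record a failed block for a row, as in the remapping table of FIG. 5."""
    remap.setdefault(row, []).append((bad_block, spare_block))

def resolve(row, block):
    """Redirect access to a remapped block; unaffected blocks pass through."""
    for bad, spare in remap.get(row, []):
        if bad == block:
            return spare
    return block

record_failure("row-C", "C1", "S1")   # first failure in the row C1, p, C2, C3
record_failure("row-C", "C3", "S2")   # a second failure in the same row
assert resolve("row-C", "C1") == "S1"
assert resolve("row-C", "C3") == "S2"
assert resolve("row-C", "C2") == "C2"
```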
[0052] In the embodiments shown in FIGS. 4 and 5, the total number of spare blocks in the spare module is the same as the number of blocks in each module. However, a spare module with a smaller number of spare blocks can be employed for saving costs. Th