UniStat
Unicode block statistics for general character categories, decomposition mappings and uppercase mappings, based on Blocks.txt and UnicodeData.txt from http://unicode.org/Public/UNIDATA/

decomposition mappings
see http://unicode.org/Public/UNIDATA/UCD.html#Character_Decomposition_Mappings
(canonical)	1926
circle	229
compat	673
final	240
font	1038
fraction	16
initial	171
isolated	238
medial	82
narrow	122
noBreak	5
small	26
square	205
sub	24
super	100
vertical	25
wide	104
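
A tally like the one above can be derived from the decomposition field of UnicodeData.txt. The following is a minimal sketch, not the tool that produced these numbers: the function name `decomposition_tag_counts` is hypothetical, and it assumes the usual semicolon-separated layout in which field 5 (counting from 0) holds the decomposition mapping, with compatibility mappings prefixed by a tag in angle brackets and canonical mappings carrying no tag.

```python
from collections import Counter

def decomposition_tag_counts(lines):
    """Count decomposition mappings by tag in UnicodeData.txt-style lines.

    Hypothetical helper: mappings without a <tag> prefix are counted
    under "(canonical)", matching the label used in the table above.
    """
    counts = Counter()
    for line in lines:
        fields = line.rstrip("\n").split(";")
        if len(fields) < 6 or not fields[5]:
            continue  # character has no decomposition mapping
        mapping = fields[5]
        if mapping.startswith("<"):
            tag = mapping[1:mapping.index(">")]
        else:
            tag = "(canonical)"
        counts[tag] += 1
    return counts

# A few sample lines in UnicodeData.txt format (illustration only).
sample = [
    "0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;",
    "00A0;NO-BREAK SPACE;Zs;0;CS;<noBreak> 0020;;;;N;NON-BREAKING SPACE;;;;",
    "00BC;VULGAR FRACTION ONE QUARTER;No;0;ON;<fraction> 0031 2044 0034;;;1/4;N;FRACTION ONE QUARTER;;;;",
    "00C0;LATIN CAPITAL LETTER A WITH GRAVE;Lu;0;L;0041 0300;;;;N;LATIN CAPITAL LETTER A GRAVE;;;00E0;",
]
counts = decomposition_tag_counts(sample)
for tag in sorted(counts):
    print(f"{tag}\t{counts[tag]}")
```

Run over the full UnicodeData.txt instead of the sample, the same loop should yield one line per tag, in the layout used above.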



major category/block table
Categories are letter, mark, number, punctuation, symbol, separator and other (which includes unassigned code points). Two further columns give the number of characters that have an uppercase mapping (upC) and a canonical decomposition mapping (deC), respectively. The final columns give the first and last code point of the block, its length, and its name.
Let	Mar	Num	Pun	Sym	Sep	Oth	upC	deC	beg	end	len	block
52	0	10	23	9	1	33	26	0	0000	007F	128	Basic Latin
65	0	6	5	18	1	33	32	53	0080	00FF	128	Latin-1 Supplement
128	0	0	0	0	0	0	63	108	0100	017F	128	Latin Extended-A
183	0	0	0	0	0	25	75	91	0180	024F	208	Latin Extended-B
96	0	0	0	0	0	0	19	0	0250	02AF	96	IPA Extensions
36	0	0	0	44	0	0	0	0	02B0	02FF	80	Spacing Modifier Letters
0	107	0	0	0	0	5	1	4	0300	036F	112	Combining Diacritical Marks
113	0	0	2	5	0	24	56	26	0370	03FF	144	Greek and Coptic
239	6	0	0	1	0	10	119	52	0400	04FF	256	Cyrillic
16	0	0	0	0	0	32	8	0	0500	052F	48	Cyrillic Supplementary
78	0	0	8	0	0	10	38	0	0530	058F	96	Armenian
30	47	0	5	0	0	30	0	0	0590	05FF	112	Hebrew
147	41	20	9	5	0	34	0	8	0600	06FF	256	Arabic
34	28	0	14	0	0	4	0	0	0700	074F	80	Syriac
39	11	0	0	0	0	14	0	0	0780	07BF	64	Thaana
66	26	10	3	0	0	23	0	11	0900	097F	128	Devanagari
52	19	16	0	3	0	38	0	5	0980	09FF	128	Bengali
51	16	10	0	0	0	51	0	6	0A00	0A7F	128	Gurmukhi
52	20	10	0	1	0	45	0	0	0A80	0AFF	128	Gujarati
53	17	10	0	1	0	47	0	5	0B00	0B7F	128	Oriya
35	14	12	0	8	0	59	0	4	0B80	0BFF	128	Tamil
51	19	10	0	0	0	48	0	1	0C00	0C7F	128	Telugu
53	19	10	0	0	0	46	0	5	0C80	0CFF	128	Kannada
52	16	10	0	0	0	50	0	3	0D00	0D7F	128	Malayalam
59	20	0	1	0	0	48	0	4	0D80	0DFF	128	Sinhala
57	16	10	3	1	0	41	0	0	0E00	0E7F	128	Thai
40	15	10	0	0	0	63	0	0	0E80	0EFF	128	Lao
47	74	20	20	32	0	63	0	17	0F00	0FFF	256	Tibetan
47	15	10	6	0	0	82	0	1	1000	109F	160	Myanmar
79	0	0	1	0	0	16	0	0	10A0	10FF	96	Georgian
240	0	0	0	0	0	16	0	0	1100	11FF	256	Hangul Jamo
317	0	20	8	0	0	39	0	0	1200	137F	384	Ethiopic
85	0	0	0	0	0	11	0	0	13A0	13FF	96	Cherokee
628	0	0	2	0	0	10	0	0	1400	167F	640	Unified Canadian Aboriginal Syllabics
26	0	0	2	0	1	3	0	0	1680	169F	32	Ogham
75	0	3	3	0	0	15	0	0	16A0	16FF	96	Runic
17	3	0	0	0	0	12	0	0	1700	171F	32	Tagalog
18	3	0	2	0	0	9	0	0	1720	173F	32	Hanunoo
18	2	0	0	0	0	12	0	0	1740	175F	32	Buhid
16	2	0	0	0	0	14	0	0	1760	177F	32	Tagbanwa
54	31	20	6	1	0	16	0	0	1780	17FF	128	Khmer
129	4	10	11	0	1	21	0	0	1800	18AF	176	Mongolian
29	24	10	2	1	0	14	0	0	1900	194F	80	Limbu
35	0	0	0	0	0	13	0	0	1950	197F	48	Tai Le
0	0	0	0	32	0	0	0	0	19E0	19FF	32	Khmer Symbols
108	0	0	0	0	0	20	0	0	1D00	1D7F	128	Phonetic Extensions
246	0	0	0	0	0	10	121	245	1E00	1EFF	256	Latin Extended Additional
218	0	0	0	15	0	23	97	229	1F00	1FFF	256	Greek Extended
0	0	0	60	2	16	34	0	2	2000	206F	112	General Punctuation
2	0	17	4	6	0	19	0	0	2070	209F	48	Superscripts and Subscripts
0	0	0	0	18	0	30	0	0	20A0	20CF	48	Currency Symbols
0	27	0	0	0	0	21	0	0	20D0	20FF	48	Combining Diacritical Marks for Symbols
43	0	0	0	32	0	5	0	3	2100	214F	80	Letterlike Symbols
0	0	49	0	0	0	15	16	0	2150	218F	64	Number Forms
0	0	0	0	112	0	0	0	6	2190	21FF	112	Arrows
0	0	0	0	256	0	0	0	38	2200	22FF	256	Mathematical Operators
0	0	0	5	204	0	47	0	2	2300	23FF	256	Miscellaneous Technical
0	0	0	0	39	0	25	0	0	2400	243F	64	Control Pictures
0	0	0	0	11	0	21	0	0	2440	245F	32	Optical Character Recognition
0	0	82	0	78	0	0	26	0	2460	24FF	160	Enclosed Alphanumerics
0	0	0	0	128	0	0	0	0	2500	257F	128	Box Drawing
0	0	0	0	32	0	0	0	0	2580	259F	32	Block Elements
0	0	0	0	96	0	0	0	0	25A0	25FF	96	Geometric Shapes
0	0	0	0	145	0	111	0	0	2600	26FF	256	Miscellaneous Symbols
0	0	30	14	130	0	18	0	0	2700	27BF	192	Dingbats
0	0	0	6	22	0	20	0	0	27C0	27EF	48	Miscellaneous Mathematical Symbols-A
0	0	0	0	16	0	0	0	0	27F0	27FF	16	Supplemental Arrows-A
0	0	0	0	256	0	0	0	0	2800	28FF	256	Braille Patterns
0	0	0	0	128	0	0	0	0	2900	297F	128	Supplemental Arrows-B
0	0	0	28	100	0	0	0	0	2980	29FF	128	Miscellaneous Mathematical Symbols-B
0	0	0	0	256	0	0	0	1	2A00	2AFF	256	Supplemental Mathematical Operators
0	0	0	0	14	0	242	0	0	2B00	2BFF	256	Miscellaneous Symbols and Arrows
0	0	0	0	115	0	13	0	0	2E80	2EFF	128	CJK Radicals Supplement
0	0	0	0	214	0	10	0	0	2F00	2FDF	224	Kangxi Radicals
0	0	0	0	12	0	4	0	0	2FF0	2FFF	16	Ideographic Description Characters
9	6	13	27	8	1	0	0	0	3000	303F	64	CJK Symbols and Punctuation
89	2	0	0	2	0	3	0	27	3040	309F	96	Hiragana
94	0	0	2	0	0	0	0	31	30A0	30FF	96	Katakana
40	0	0	0	0	0	8	0	0	3100	312F	48	Bopomofo
94	0	0	0	0	0	2	0	0	3130	318F	96	Hangul Compatibility Jamo
0	0	4	0	12	0	0	0	0	3190	319F	16	Kanbun
24	0	0	0	0	0	8	0	0	31A0	31BF	32	Bopomofo Extended
16	0	0	0	0	0	0	0	0	31F0	31FF	16	Katakana Phonetic Extensions
0	0	50	0	191	0	15	0	0	3200	32FF	256	Enclosed CJK Letters and Months
0	0	0	0	256	0	0	0	0	3300	33FF	256	CJK Compatibility
6582	0	0	0	0	0	10	0	0	3400	4DBF	6592	CJK Unified Ideographs Extension A
0	0	0	0	64	0	0	0	0	4DC0	4DFF	64	Yijing Hexagram Symbols
20902	0	0	0	0	0	90	0	0	4E00	9FFF	20992	CJK Unified Ideographs
1165	0	0	0	0	0	3	0	0	A000	A48F	1168	Yi Syllables
0	0	0	0	55	0	9	0	0	A490	A4CF	64	Yi Radicals
11172	0	0	0	0	0	12	0	0	AC00	D7AF	11184	Hangul Syllables
0	0	0	0	0	0	896	0	0	D800	DB7F	896	High Surrogates
0	0	0	0	0	0	128	0	0	DB80	DBFF	128	High Private Use Surrogates
0	0	0	0	0	0	1024	0	0	DC00	DFFF	1024	Low Surrogates
0	0	0	0	0	0	6400	0	0	E000	F8FF	6400	Private Use Area
361	0	0	0	0	0	151	0	349	F900	FAFF	512	CJK Compatibility Ideographs
56	1	0	0	1	0	22	0	34	FB00	FB4F	80	Alphabetic Presentation Forms
591	0	0	2	2	0	93	0	0	FB50	FDFF	688	Arabic Presentation Forms-A
0	16	0	0	0	0	0	0	0	FE00	FE0F	16	Variation Selectors
0	4	0	0	0	0	12	0	0	FE20	FE2F	16	Combining Half Marks
0	0	0	32	0	0	0	0	0	FE30	FE4F	32	CJK Compatibility Forms
0	0	0	21	5	0	6	0	0	FE50	FE6F	32	Small Form Variants
140	0	0	0	0	0	4	0	0	FE70	FEFF	144	Arabic Presentation Forms-B
162	0	10	30	23	0	15	26	0	FF00	FFEF	240	Halfwidth and Fullwidth Forms
0	0	0	0	2	0	14	0	0	FFF0	FFFF	16	Specials
88	0	0	0	0	0	40	0	0	10000	1007F	128	Linear B Syllabary
123	0	0	0	0	0	5	0	0	10080	100FF	128	Linear B Ideograms
0	0	45	2	10	0	7	0	0	10100	1013F	64	Aegean Numbers
31	0	4	0	0	0	13	0	0	10300	1032F	48	Old Italic
26	0	1	0	0	0	5	0	0	10330	1034F	32	Gothic
30	0	0	1	0	0	1	0	0	10380	1039F	32	Ugaritic
80	0	0	0	0	0	0	40	0	10400	1044F	80	Deseret
48	0	0	0	0	0	0	0	0	10450	1047F	48	Shavian
30	0	10	0	0	0	8	0	0	10480	104AF	48	Osmanya
55	0	0	0	0	0	9	0	0	10800	1083F	64	Cypriot Syllabary
0	0	0	0	246	0	10	0	0	1D000	1D0FF	256	Byzantine Musical Symbols
0	30	0	0	181	0	45	0	13	1D100	1D1FF	256	Musical Symbols
0	0	0	0	87	0	9	0	0	1D300	1D35F	96	Tai Xuan Jing Symbols
932	0	50	0	10	0	32	0	0	1D400	1D7FF	1024	Mathematical Alphanumeric Symbols
42711	0	0	0	0	0	9	0	0	20000	2A6DF	42720	CJK Unified Ideographs Extension B
542	0	0	0	0	0	2	0	542	2F800	2FA1F	544	CJK Compatibility Ideographs Supplement
0	0	0	0	0	0	128	0	0	E0000	E007F	128	Tags
0	240	0	0	0	0	0	0	0	E0100	E01EF	240	Variation Selectors Supplement
0	0	0	0	0	0	65536	0	0	F0000	FFFFF	65536	Supplementary Private Use Area-A
0	0	0	0	0	0	65536	0	0	100000	10FFFF	65536	Supplementary Private Use Area-B
90547	941	612	370	3754	21	142187	763	1926	
BMP: nonblock 4112 unassigned 2247
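
One row of the table above could be computed roughly as follows. This is a sketch under stated assumptions, not the original tool: the names `COLUMNS`, `parse_blocks` and `block_row` are hypothetical, fields are taken by position from UnicodeData.txt (2 = general category, 5 = decomposition, 12 = uppercase mapping), and the `<..., First>` / `<..., Last>` range pairs used for large ideograph and private-use ranges are not expanded. Code points of a block that have no entry are counted under Oth, as in the table.

```python
from collections import Counter

# Table column for each major category, keyed by the first letter of
# the general category (for example "Lu" -> "L" -> Let).
COLUMNS = {"L": "Let", "M": "Mar", "N": "Num", "P": "Pun",
           "S": "Sym", "Z": "Sep", "C": "Oth"}

def parse_blocks(lines):
    """Parse Blocks.txt-style lines such as "0000..007F; Basic Latin"."""
    blocks = []
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        span, name = line.split(";", 1)
        beg, end = span.split("..")
        blocks.append((int(beg, 16), int(end, 16), name.strip()))
    return blocks

def block_row(block, unicode_data_lines):
    """Return the counts for one block as a dict keyed by column name."""
    beg, end, _name = block
    row = Counter()
    assigned = 0
    for line in unicode_data_lines:
        fields = line.rstrip("\n").split(";")
        cp = int(fields[0], 16)
        if not beg <= cp <= end:
            continue
        assigned += 1
        row[COLUMNS[fields[2][0]]] += 1
        if fields[12]:
            row["upC"] += 1  # has an uppercase mapping
        if fields[5] and not fields[5].startswith("<"):
            row["deC"] += 1  # has a canonical (untagged) decomposition
    length = end - beg + 1
    row["Oth"] += length - assigned  # code points without an entry
    row["len"] = length
    return dict(row)

# Artificial four-code-point block with only three entries supplied,
# so that the "no entry" path is exercised (illustration only).
blocks = parse_blocks(["# sample", "00E0..00E3; Sample Block"])
data = [
    "00E0;LATIN SMALL LETTER A WITH GRAVE;Ll;0;L;0061 0300;;;;N;LATIN SMALL LETTER A GRAVE;;00C0;;00C0",
    "00E1;LATIN SMALL LETTER A WITH ACUTE;Ll;0;L;0061 0301;;;;N;LATIN SMALL LETTER A ACUTE;;00C1;;00C1",
    "00E2;LATIN SMALL LETTER A WITH CIRCUMFLEX;Ll;0;L;0061 0302;;;;N;LATIN SMALL LETTER A CIRCUMFLEX;;00C2;;00C2",
]
row = block_row(blocks[0], data)
print(row)
```

A useful sanity check on any row is that the seven category columns sum to the block length, for example 52 + 0 + 10 + 23 + 9 + 1 + 33 = 128 for Basic Latin.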

detailed block stats
see http://unicode.org/Public/UNIDATA/UCD.html#General_Category_Values
Basic Latin	b0000	l128	Lu26	Ll26	Nd10	Zs1	Cc33	Pc1	Pd1	Ps3	Pe3	Po15	Sm6	Sc1	Sk2	
Latin-1 Supplement	b0080	l128	Lu30	Ll35	No6	Zs1	Cc32	Cf1	Pi1	Pf1	Po3	Sm4	Sc4	Sk4	So6	
Latin Extended-A	b0100	l128	Lu63	Ll65	
Latin Extended-B	b0180	l208	Cn25	Lu90	Ll84	Lt4	Lo5	
IPA Extensions	b0250	l96	Ll96	
Spacing Modifier Letters	b02B0	l80	Lm36	Sk44	
Combining Diacritical Marks	b0300	l112	Cn5	Mn107	
Greek and Coptic	b0370	l144	Cn24	Lu52	Ll60	Lm1	Po2	Sm1	Sk4	
Cyrillic	b0400	l256	Cn10	Lu120	Ll119	Mn4	Me2	So1	
Cyrillic Supplementary	b0500	l48	Cn32	Lu8	Ll8	
Armenian	b0530	l96	Cn10	Lu38	Ll39	Lm1	Pd1	Po7	
Hebrew	b0590	l112	Cn30	Lo30	Mn47	Po5	
Arabic	b0600	l256	Cn29	Lm3	Lo144	Mn40	Me1	Nd20	Cf5	Po9	So5	
Syriac	b0700	l80	Cn3	Lo34	Mn28	Cf1	Po14	
Thaana	b0780	l64	Cn14	Lo39	Mn11	
Devanagari	b0900	l128	Cn23	Lo66	Mn18	Mc8	Nd10	Po3	
Bengali	b0980	l128	Cn38	Lo52	Mn9	Mc10	Nd10	No6	Sc2	So1	
Gurmukhi	b0A00	l128	Cn51	Lo51	Mn12	Mc4	Nd10	
Gujarati	b0A80	l128	Cn45	Lo52	Mn13	Mc7	Nd10	Sc1	
Oriya	b0B00	l128	Cn47	Lo53	Mn8	Mc9	Nd10	So1	
Tamil	b0B80	l128	Cn59	Lo35	Mn3	Mc11	Nd9	No3	Sc1	So7	
Telugu	b0C00	l128	Cn48	Lo51	Mn12	Mc7	Nd10	
Kannada	b0C80	l128	Cn46	Lo53	Mn5	Mc14	Nd10	
Malayalam	b0D00	l128	Cn50	Lo52	Mn4	Mc12	Nd10	
Sinhala	b0D80	l128	Cn48	Lo59	Mn5	Mc15	Po1	
Thai	b0E00	l128	Cn41	Lm1	Lo56	Mn16	Nd10	Po3	Sc1	
Lao	b0E80	l128	Cn63	Lm1	Lo39	Mn15	Nd10	
Tibetan	b0F00	l256	Cn63	Lo47	Mn71	Mc3	Nd10	No10	Ps2	Pe2	Po16	So32	
Myanmar	b1000	l160	Cn82	Lo47	Mn10	Mc5	Nd10	Po6	
Georgian	b10A0	l96	Cn16	Lu38	Lo41	Po1	
Hangul Jamo	b1100	l256	Cn16	Lo240	
Ethiopic	b1200	l384	Cn39	Lo317	Nd9	No11	Po8	
Cherokee	b13A0	l96	Cn11	Lo85	
Unified Canadian Aboriginal Syllabics	b1400	l640	Cn10	Lo628	Po2	
Ogham	b1680	l32	Cn3	Lo26	Zs1	Ps1	Pe1	
Runic	b16A0	l96	Cn15	Lo75	Nl3	Po3	
Tagalog	b1700	l32	Cn12	Lo17	Mn3	
Hanunoo	b1720	l32	Cn9	Lo18	Mn3	Po2	
Buhid	b1740	l32	Cn12	Lo18	Mn2	
Tagbanwa	b1760	l32	Cn14	Lo16	Mn2	
Khmer	b1780	l128	Cn14	Lm1	Lo53	Mn20	Mc11	Nd10	No10	Cf2	Po6	Sc1	
Mongolian	b1800	l176	Cn21	Lm1	Lo128	Mn4	Nd10	Zs1	Pd1	Po10	
Limbu	b1900	l80	Cn14	Lo29	Mn9	Mc15	Nd10	Po2	So1	
Tai Le	b1950	l48	Cn13	Lo35	
Khmer Symbols	b19E0	l32	So32	
Phonetic Extensions	b1D00	l128	Cn20	Ll54	Lm54	
Latin Extended Additional	b1E00	l256	Cn10	Lu120	Ll126	
Greek Extended	b1F00	l256	Cn23	Lu69	Ll122	Lt27	Sk15	
General Punctuation	b2000	l112	Cn15	Zs14	Zl1	Zp1	Cf19	Pc3	Pd6	Ps3	Pe1	Pi5	Pf3	Po39	Sm2	
Superscripts and Subscripts	b2070	l48	Cn19	Ll2	No17	Ps2	Pe2	Sm6	
Currency Symbols	b20A0	l48	Cn30	Sc18	
Combining Diacritical Marks for Symbols	b20D0	l48	Cn21	Mn20	Me7	
Letterlike Symbols	b2100	l80	Cn5	Lu27	Ll12	Lo4	Sm6	So26	
Number Forms	b2150	l64	Cn15	Nl36	No13	
Arrows	b2190	l112	Sm27	So85	
Mathematical Operators	b2200	l256	Sm256	
Miscellaneous Technical	b2300	l256	Cn47	Ps2	Pe2	Po1	Sm32	So172	
Control Pictures	b2400	l64	Cn25	So39	
Optical Character Recognition	b2440	l32	Cn21	So11	
Enclosed Alphanumerics	b2460	l160	No82	So78	
Box Drawing	b2500	l128	So128	
Block Elements	b2580	l32	So32	
Geometric Shapes	b25A0	l96	Sm10	So86	
Miscellaneous Symbols	b2600	l256	Cn111	Sm1	So144	
Dingbats	b2700	l192	Cn18	No30	Ps7	Pe7	So130	
Miscellaneous Mathematical Symbols-A	b27C0	l48	Cn20	Ps3	Pe3	Sm22	
Supplemental Arrows-A	b27F0	l16	Sm16	
Braille Patterns	b2800	l256	So256	
Supplemental Arrows-B	b2900	l128	Sm128	
Miscellaneous Mathematical Symbols-B	b2980	l128	Ps14	Pe14	Sm100	
Supplemental Mathematical Operators	b2A00	l256	Sm256	
Miscellaneous Symbols and Arrows	b2B00	l256	Cn242	So14	
CJK Radicals Supplement	b2E80	l128	Cn13	So115	
Kangxi Radicals	b2F00	l224	Cn10	So214	
Ideographic Description Characters	b2FF0	l16	Cn4	So12	
CJK Symbols and Punctuation	b3000	l64	Lm7	Lo2	Mn6	Nl13	Zs1	Pd2	Ps10	Pe11	Po4	So8	
Hiragana	b3040	l96	Cn3	Lm2	Lo87	Mn2	Sk2	
Katakana	b30A0	l96	Lm3	Lo91	Pc1	Pd1	
Bopomofo	b3100	l48	Cn8	Lo40	
Hangul Compatibility Jamo	b3130	l96	Cn2	Lo94	
Kanbun	b3190	l16	No4	So12	
Bopomofo Extended	b31A0	l32	Cn8	Lo24	
Katakana Phonetic Extensions	b31F0	l16	Lo16	
Enclosed CJK Letters and Months	b3200	l256	Cn15	No50	So191	
CJK Compatibility	b3300	l256	So256	
CJK Unified Ideographs Extension A	b3400	l6592	Cn10	Lo6582	
Yijing Hexagram Symbols	b4DC0	l64	So64	
CJK Unified Ideographs	b4E00	l20992	Cn90	Lo20902	
Yi Syllables	bA000	l1168	Cn3	Lo1165	
Yi Radicals	bA490	l64	Cn9	So55	
Hangul Syllables	bAC00	l11184	Cn12	Lo11172	
High Surrogates	bD800	l896	Cs896	
High Private Use Surrogates	bDB80	l128	Cs128	
Low Surrogates	bDC00	l1024	Cs1024	
Private Use Area	bE000	l6400	Co6400	
CJK Compatibility Ideographs	bF900	l512	Cn151	Lo361	
Alphabetic Presentation Forms	bFB00	l80	Cn22	Ll12	Lo44	Mn1	Sm1	
Arabic Presentation Forms-A	bFB50	l688	Cn93	Lo591	Ps1	Pe1	Sc1	So1	
Variation Selectors	bFE00	l16	Mn16	
Combining Half Marks	bFE20	l16	Cn12	Mn4	
CJK Compatibility Forms	bFE30	l32	Pc5	Pd2	Ps9	Pe9	Po7	
Small Form Variants	bFE50	l32	Cn6	Pd2	Ps3	Pe3	Po13	Sm4	Sc1	
Arabic Presentation Forms-B	bFE70	l144	Cn3	Lo140	Cf1	
Halfwidth and Fullwidth Forms	bFF00	l240	Cn15	Lu26	Ll26	Lm3	Lo107	Nd10	Pc2	Pd1	Ps5	Pe5	Po17	Sm11	Sc5	Sk3	So4	
Specials	bFFF0	l16	Cn11	Cf3	So2	
Linear B Syllabary	b10000	l128	Cn40	Lo88	
Linear B Ideograms	b10080	l128	Cn5	Lo123	
Aegean Numbers	b10100	l64	Cn7	No45	Po2	So10	
Old Italic	b10300	l48	Cn13	Lo31	No4	
Gothic	b10330	l32	Cn5	Lo26	Nl1	
Ugaritic	b10380	l32	Cn1	Lo30	Po1	
Deseret	b10400	l80	Lu40	Ll40	
Shavian	b10450	l48	Lo48	
Osmanya	b10480	l48	Cn8	Lo30	Nd10	
Cypriot Syllabary	b10800	l64	Cn9	Lo55	
Byzantine Musical Symbols	b1D000	l256	Cn10	So246	
Musical Symbols	b1D100	l256	Cn37	Mn22	Mc8	Cf8	So181	
Tai Xuan Jing Symbols	b1D300	l96	Cn9	So87	
Mathematical Alphanumeric Symbols	b1D400	l1024	Cn32	Lu443	Ll489	Nd50	Sm10	
CJK Unified Ideographs Extension B	b20000	l42720	Cn9	Lo42711	
CJK Compatibility Ideographs Supplement	b2F800	l544	Cn2	Lo542	
Tags	bE0000	l128	Cn31	Cf97	
Variation Selectors Supplement	bE0100	l240	Mn240	
Supplementary Private Use Area-A	bF0000	l65536	Cn2	Co65534	
Supplementary Private Use Area-B	b100000	l65536	Cn2	Co65534
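
The detailed lines above can be read back and folded into the major categories of the earlier table. The sketch below is an illustration only: `parse_detail_line` and `major_totals` are hypothetical names, and it assumes the tab-separated layout shown above (block name, `b` plus the start code point in hex, `l` plus the length in decimal, then one field per two-letter general category with its count).

```python
import re
from collections import Counter

# Major-category column for the first letter of a general category.
MAJOR = {"L": "Let", "M": "Mar", "N": "Num", "P": "Pun",
         "S": "Sym", "Z": "Sep", "C": "Oth"}

def parse_detail_line(line):
    """Split one detailed line into (name, begin, length, counts)."""
    fields = line.rstrip("\t\n").split("\t")
    name = fields[0]
    begin = int(fields[1][1:], 16)   # "b0B80" -> 0x0B80
    length = int(fields[2][1:])      # "l128"  -> 128
    counts = {}
    for field in fields[3:]:
        match = re.fullmatch(r"([A-Z][a-z])(\d+)", field)
        if match is None:
            raise ValueError(f"unexpected field: {field!r}")
        counts[match.group(1)] = int(match.group(2))
    return name, begin, length, counts

def major_totals(counts):
    """Sum two-letter category counts into major-category totals."""
    totals = Counter()
    for category, n in counts.items():
        totals[MAJOR[category[0]]] += n
    return dict(totals)

line = "Tamil\tb0B80\tl128\tCn59\tLo35\tMn3\tMc11\tNd9\tNo3\tSc1\tSo7\t"
name, begin, length, counts = parse_detail_line(line)
totals = major_totals(counts)

# The per-category counts of a line should add up to the block length.
assert sum(counts.values()) == length
print(name, f"{begin:04X}", length, totals)
```

For the Tamil line this gives 35 letters, 14 marks, 12 numbers, 8 symbols and 59 other, which agrees with the Tamil row of the major category table.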