* Sorting weight table for European Extended Latin (A), default order.
* This covers Unicode block 5 LATIN EXTENDED-A (0100 - 017F).
*
* The table is intended to be loaded after 'LATIN1-DEFAULT', which
* covers blocks 2 & 4. Note that all these block 5 characters are
* forced into block 4, so that all alphabetic characters in blocks 2,
* 4 & 5 sort together. This means that the control characters (blocks 1 and 3)
* sort before all of these, in Unicode value order.
*
* SYNTAX:
* Each non-comment line gives one or more weights for a character,
* as follows (character value in hex, weights in decimal):
* Field 1 = Unicode character value
* Field 2 = Shared weight (characters that sort together if accents
*           and case were to be disregarded should have the same SW)
*           Or, Block Weight/Shared Weight. This form allows characters
*           in different Unicode blocks to have equal SWs. If BW is
*           omitted, only SWs for characters in the same block are equal.
* Field 3 = Accent weight, or '-' to omit or copy from previous.
*           Please use values as defined in the file NLS.WT.LOOKUP.
* Field 4 = Case weight, or 'U' for upper and 'L' for lower case chars.
*
* HEX (BW/)SW AW CW
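*
* The field layout described above can be read with a short script. The
* sketch below is illustrative only and is not the loader that consumes
* this table. The names WeightEntry and parse_weight_table are hypothetical.
* It assumes fields are separated by whitespace, that '*' starts a comment,
* that '-' in Field 2 repeats the BW/SW of the previous entry (which is how
* the rows in this table appear to use it), and that '-' in Field 3 means
* the accent weight is omitted.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class WeightEntry:
    """One parsed row of the weight table (hypothetical structure)."""
    code_point: int
    block_weight: Optional[int]   # None when no BW/ prefix was given
    shared_weight: int
    accent_weight: Optional[int]  # None when Field 3 is '-'
    case_weight: str              # 'U', 'L', or a decimal weight as text


def parse_weight_table(lines: Iterable[str]) -> List[WeightEntry]:
    """Parse weight-table lines into entries, skipping comments."""
    entries: List[WeightEntry] = []
    prev_bw: Optional[int] = None
    prev_sw: Optional[int] = None
    for raw in lines:
        # Assumption: everything from the first '*' onward is a comment.
        line = raw.split("*", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        if len(fields) != 4:
            raise ValueError(f"expected 4 fields, got {len(fields)}: {raw!r}")
        hex_value, sw_field, aw_field, cw_field = fields
        if sw_field == "-":
            # Assumption: '-' repeats the previous entry's BW/SW.
            if prev_sw is None:
                raise ValueError(f"no previous shared weight for: {raw!r}")
            bw, sw = prev_bw, prev_sw
        elif "/" in sw_field:
            bw_text, sw_text = sw_field.split("/", 1)
            bw, sw = int(bw_text), int(sw_text)
        else:
            bw, sw = None, int(sw_field)
        aw = None if aw_field == "-" else int(aw_field)
        entries.append(WeightEntry(int(hex_value, 16), bw, sw, aw, cw_field))
        prev_bw, prev_sw = bw, sw
    return entries
```

* For example, passing the rows for 0102 and 0103 to parse_weight_table
* yields two entries that share block weight 4 and shared weight 1000 and
* differ only in their case weight.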
* A-related
0102  4/1000  5   U  * A WITH BREVE
0103  -       5   L
0104  -       44  U  * A WITH OGONEK
0105  -       44  L
0100  -       46  U  * A WITH MACRON
0101  -       46  L
* C-related
0106  4/1030  1   U  * C WITH ACUTE
0107  -       1   L
0108  -       13  U  * C WITH CIRCUMFLEX
0109  -       13  L
010C  -       19  U  * C WITH CARON
010D  -       19  L
010A  -       35  U  * C WITH DOT ABOVE
010B  -       35  L
* D-related
010E  4/1040  19  U  * D WITH CARON
010F  -       19  L
0110  -       38  U  * D WITH STROKE
0111  -       38  L
* E-related
0114  4/1060  5   U  * E WITH BREVE
0115  -       5   L
011A  -       19  U  * E WITH CARON
011B  -       19  L
0116  -       35  U  * E WITH DOT ABOVE
0117  -       35  L
0118  -       44  U  * E WITH OGONEK
0119  -       44  L
0112  -       46  U  * E WITH MACRON
0113  -       46  L
* G-related
011E  4/1090  5   U  * G WITH BREVE
011F  -       5   L
011C  -       13  U  * G WITH CIRCUMFLEX
011D  -       13  L
0120  -       35  U  * G WITH DOT ABOVE
0121  -       35  L
0122  -       40  U  * G WITH CEDILLA
0123  -       40  L
* H-related
0124  4/1100  13  U  * H WITH CIRCUMFLEX
0125  -       13  L
0126  -       38  U  * H WITH STROKE
0127  -       38  L
* I-related
012C  4/1110  5   U  * I WITH BREVE
012D  -       5   L
0128  -       31  U  * I WITH TILDE
0129  -       31  L
0130  -       35  U  * I WITH DOT ABOVE
012E  -       44  U  * I WITH OGONEK
012F  -       44  L
012A  -       46  U  * I WITH MACRON
012B  -       46  L
* Dotless lowercase I, comes after i
0131  4/1117  -   L  * DOTLESS I
* IJ ligature, comes after I
0132  4/1119  -   U  * LIGATURE IJ
0133  -       -   L
* J-related
0134  4/1120  13  U  * J WITH CIRCUMFLEX
0135  -       13  L
* K-related
0136  4/1130  40  U  * K WITH CEDILLA
0137  -       40  L
* Letter KRA, comes after K
0138  4/1137  -   L  * KRA
* L-related
0139  4/1140  1   U  * L WITH ACUTE
013A  -       1   L
013D  -       19  U  * L WITH CARON
013E  -       19  L
013B  -       40  U  * L WITH CEDILLA
013C  -       40  L
* L with middle dot, comes after L
013F  4/1145  -   U  * L WITH MIDDLE DOT
0140  -       -   L
* L with slash, comes after L
0141  4/1147  38  U  * L WITH STROKE
0142  -       38  L
* N-related
0143  4/1160  1   U  * N WITH ACUTE
0144  -       1   L
0147  -       19  U  * N WITH CARON
0148  -       19  L
0145  -       40  U  * N WITH CEDILLA
0146  -       40  L
* n preceded by apostrophe, comes after n
0149  4/1165  -   L  * N PRECEDED BY APOSTROPHE
* ENG, comes after N
014A  4/1167  -   U  * ENG
014B  -       -   L
* O-related
014E  4/1170  5   U  * O WITH BREVE
014F  -       5   L
0150  -       29  U  * O WITH DOUBLE ACUTE
0151  -       29  L
014C  -       46  U  * O WITH MACRON
014D  -       46  L
* OE ligature, comes after O
0152  4/1175  -   U  * LIGATURE OE
0153  -       -   L
* R-related
0154  4/1210  1   U  * R WITH ACUTE
0155  -       1   L
0158  -       19  U  * R WITH CARON
0159  -       19  L
0156  -       40  U  * R WITH CEDILLA
0157  -       40  L
* S-related
015A  4/1220  1   U  * S WITH ACUTE
015B  -       1   L
015C  -       13  U  * S WITH CIRCUMFLEX
015D  -       13  L
0160  -       19  U  * S WITH CARON
0161  -       19  L
015E  -       40  U  * S WITH CEDILLA
015F  -       40  L
* Letter long s, comes after s
017F  4/1224  -   L  * LONG S
* T-related
0164  4/1240  19  U  * T WITH CARON
0165  -       19  L
0166  -       38  U  * T WITH STROKE
0167  -       38  L
0162  -       40  U  * T WITH CEDILLA
0163  -       40  L
* U-related
016C  4/1250  5   U  * U WITH BREVE
016D  -       5   L
016E  -       21  U  * U WITH RING ABOVE
016F  -       21  L
0170  -       29  U  * U WITH DOUBLE ACUTE
0171  -       29  L
0168  -       31  U  * U WITH TILDE
0169  -       31  L
0172  -       44  U  * U WITH OGONEK
0173  -       44  L
016A  -       46  U  * U WITH MACRON
016B  -       46  L
* W-related
0174  4/1270  13  U  * W WITH CIRCUMFLEX
0175  -       13  L
* Y-related
0176  4/1290  13  U  * Y WITH CIRCUMFLEX
0177  -       13  L
0178  -       24  U  * Y WITH DIAERESIS
* Z-related
0179  4/1300  1   U  * Z WITH ACUTE
017A  -       1   L
017D  -       19  U  * Z WITH CARON
017E  -       19  L
017B  -       35  U  * Z WITH DOT ABOVE
017C  -       35  L
* END