This repository was archived by the owner on May 1, 2022. It is now read-only.

Commit 2ddb514

lmb committed
Add several string types.
Namely VISIBLESTRING and T61STRING. Both string encodings are only supported in a very limited fashion, see docs/T61String.md for a discussion of what would be required. Safe comparison, etc. has not been implemented yet.
1 parent 0595e84 commit 2ddb514

File tree

7 files changed: +262 -63 lines changed

docs/T61String.md

+145
@@ -0,0 +1,145 @@
+This file is Copyright (c) 2003, 2006 Lev Walkin <[email protected]>. All rights
+reserved. Redistribution and modifications are permitted subject to BSD license.
+
+Originally part of the asn1c source code, file TeletexString.c. -- Lorenz Bauer
+
+Here is a formal attempt at creating a mapping from TeletexString
+(T61String) of the latest ASN.1 standard (X.680:2002) into the Unicode
+character set. -- Lev Walkin <[email protected]>
+
+The first thing to keep in mind is that TeletexString (T61String)
+is defined in ASN.1, and is not really a T.61 string.
+The T.61 standard is withdrawn by ITU-T and is no longer an authoritative
+reference. See http://www.itu.int/rec/T-REC-T.61
+
+The X.680 specifies TeletexString (T61String) as a combination of the
+character sets specified by the registration numbers listed in
+ISO International Register of Coded Character Sets to be used with
+Escape Sequences (ISO-2375):
+6, 87, 102, 103, 106, 107, 126, 144, 150, 153, 156, 164, 165, 168,
+plus SPACE and DELETE characters.
+In addition to that, the X.680 Table 6 NOTE 2 allows using register entries
+6 and 156 instead of 102 and 103.
+
+The ISO Register itself is available at http://www.itscj.ipsj.or.jp/ISO-IR/
+
+#6 is ASCII. http://www.itscj.ipsj.or.jp/ISO-IR/006.pdf
+Escapes into:
+G0: ESC 2/8 4/2 ("(B")
+G1: ESC 2/9 4/2 (")B")
+The range is [0x21 .. 0x7e]. Conversion into Unicode
+is simple, because it has one-to-one correspondence.
+#87 is a "Japanese Graphic Character Set for Information Interchange".
+Is a multiple-byte set of 6877 characters.
+The character set is JIS X 0208-1983 (originally JIS C 6226-1983).
+Escapes into:
+G0: ESC 2/4 4/2 ("$B")
+G1: ESC 2/4 2/9 4/2 ("$)B")
+G2: ESC 2/4 2/10 4/2 ("$*B")
+G3: ESC 2/4 2/11 4/2 ("$+B")
+#102 is "Teletex Primary Set of Graphic Characters" and is almost ASCII.
+Escapes into:
+G0: ESC 2/8 7/5 ("(u")
+G1: ESC 2/9 7/5 (")u")
+G2: ESC 2/10 7/5 ("*u")
+G3: ESC 2/11 7/5 ("+u")
+It is almost identical to ASCII, except for ASCII position for '$'
+(DOLLAR SIGN) is filled with '¤' (CURRENCY SIGN), which is U+00A4.
+Also, ASCII positions for '`', '\', '^', '{', '}', '~' are marked
+as "should not be used".
+#103 is a supplementary set of characters used in combination with #102.
+Escapes into:
+G0: ESC 2/8 7/6 ("(v")
+G1: ESC 2/9 7/6 (")v")
+G2: ESC 2/10 7/6 ("*v")
+G3: ESC 2/11 7/6 ("+v")
+Some characters in that character set are combining characters,
+which can only be restrictively used with certain basic Latin letters.
+It can be thought of as a subset of #156 with the exception of 4/12
+which is UNDERLINE in #103 and absent in #156.
+#106 is a primary set of control functions, used in combination with #107.
+Escapes into:
+C0: ESC 2/1 4/5 ("!E")
+This set is so short I can list it here:
+0x08 BS BACKSPACE -- same as Unicode
+0x0a LF LINE FEED -- same as Unicode
+0x0c FF FORM FEED -- same as Unicode
+0x0d CR CARRIAGE RETURN -- same as Unicode
+0x0e LS1 LOCKING SHIFT ONE
+0x0f LS0 LOCKING SHIFT ZERO
+0x19 SS2 SINGLE SHIFT TWO
+0x1a SUB SUBSTITUTE CHARACTER
+0x1b ESC ESCAPE -- same as Unicode
+0x1d SS3 SINGLE SHIFT THREE
+The LS1 and LS0 are two magical functions which, respectively, invoke
+the currently designated G1 or G0 set into positions 2/1 to 7/14
+The SS2 and SS3, respectively, invoke one character of the
+currently designated set G2 and G3.
+The SUB is wholly equivalent to U+001a (SUBSTITUTE)
+#107 is a supplementary set of control functions, used with #106.
+Escapes into:
+C1: ESC 2/2 4/8 ('"H')
+This set contains three special control codes:
+0x8b PLD PARTIAL LINE DOWN -- similar to <SUB>
+0x8c PLU PARTIAL LINE UP -- similar to <SUP>
+0x9b CSI CONTROL SEQUENCE INTRODUCER
+This set is so out of world we can probably safely ignore it.
+#126 is a "Right-hand Part of the Latin/Greek Alphabet".
+Comprises of 90 characters, including accented letters.
+Escapes into:
+G1: ESC 2/13 4/6 ("-F")
+G2: ESC 2/14 4/6 (".F")
+G3: ESC 2/15 4/6 ("/F")
+Note: This Registration is a subset of ISO-IR 227.
+#144 is a "Cyrillic part of the Latin/Cyrillic Alphabet".
+Comprises of 95 characters.
+Escapes into:
+G1: ESC 2/13 4/12 ("-L")
+G2: ESC 2/14 4/12 (".L")
+G3: ESC 2/15 4/12 ("/L")
+#150 is a "Greek Primary Set of Graphic Characters".
+Comprises of 94 characters.
+Escapes into:
+G0: ESC 2/8 2/1 4/0 ("(!@")
+G1: ESC 2/9 2/1 4/0 (")!@")
+G2: ESC 2/10 2/1 4/0 ("*!@")
+G3: ESC 2/11 2/1 4/0 ("+!@")
+#153 is a "Basic Cyrillic Character Set for 8-bit codes".
+Comprises of 68 characters.
+Escapes into:
+G1: ESC 2/13 4/15 ("-O")
+G2: ESC 2/14 4/15 (".O")
+G3: ESC 2/15 4/15 ("/O")
+#156 is a "Supplementary Set of ISO/IEC 6937:1992" for use with #6
+Comprises of 87 characters.
+Escapes into:
+G1: ESC 2/13 5/2 ("-R")
+G2: ESC 2/14 5/2 (".R")
+G3: ESC 2/15 5/2 ("/R")
+#164 is a "Hebrew Supplementary Set of Graphic Characters"
+Comprises of 27 characters.
+Escapes into:
+G1: ESC 2/13 5/3 ("-S")
+G2: ESC 2/14 5/3 (".S")
+G3: ESC 2/15 5/3 ("/S")
+#165 is a set of "Codes of the Chinese graphic character set"
+Is a multiple-byte set of 8446 characters.
+Escapes into:
+G0: ESC 2/4 2/8 4/5 ("$(E")
+G1: ESC 2/4 2/9 4/5 ("$)E")
+G2: ESC 2/4 2/10 4/5 ("$*E")
+G3: ESC 2/4 2/11 4/5 ("$+E")
+#168 is a "Japanese Graphic Character Set for Information Interchange"
+A multiple-byte set of 6879 characters updated from #87.
+Escapes into:
+G0: ESC 2/6 4/0 ESC 2/4 4/2 ("&@" "$B")
+G1: ESC 2/6 4/0 ESC 2/4 2/9 4/2 ("&@" "$)B")
+G2: ESC 2/6 4/0 ESC 2/4 2/10 4/2 ("&@" "$*B")
+G3: ESC 2/6 4/0 ESC 2/4 2/11 4/2 ("&@" "$+B")
+
+The different registers reside at the following byte values:
+- C0: 0x00 - 0x1f
+- G0: 0x20 - 0x7f
+- C1: 0x80 - 0x9f
+- G2: 0xa0 - 0xff
+- G2 and G3: ???
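
Note on the notation used in docs/T61String.md: the "column/row" pairs such as 2/8 4/2 name byte values (column * 16 + row), so ESC 2/8 4/2 is the byte sequence 0x1b 0x28 0x42, i.e. ESC "(B". The following is a minimal sketch of that conversion and of the #102 to Unicode mapping the file describes. The names `colrow` and `t61_primary_to_unicode` are hypothetical and are not part of asinine or asn1c. Rejecting the "should not be used" positions and limiting the set to 0x21..0x7e are assumptions of this sketch, not something the commit implements.

```c
#include <stdint.h>

/* ISO 2022 column/row notation: byte value = column * 16 + row. */
static uint8_t
colrow(unsigned column, unsigned row)
{
    return (uint8_t)((column << 4) | (row & 0x0f));
}

/*
 * Hypothetical helper: map one byte of register entry #102 to a Unicode
 * code point. Returns -1 for bytes outside 0x21..0x7e and, as an assumption
 * of this sketch, for the positions marked "should not be used".
 */
static int32_t
t61_primary_to_unicode(uint8_t byte)
{
    if (byte < 0x21 || byte > 0x7e) {
        return -1;
    }

    switch (byte) {
    case 0x24: /* the '$' position holds CURRENCY SIGN in #102 */
        return 0x00a4;
    case 0x5c: /* backslash */
    case 0x5e: /* circumflex */
    case 0x60: /* grave accent */
    case 0x7b: /* left brace */
    case 0x7d: /* right brace */
    case 0x7e: /* tilde */
        return -1;
    default:
        return byte; /* same position as ASCII, hence same code point */
    }
}
```

With these helpers, the G0 designation of #6 can be written as `{ colrow(1, 11), colrow(2, 8), colrow(4, 2) }`, which makes it easy to cross-check the quoted strings in the register list against their column/row form.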

include/asinine/asn1.h

+3-1
@@ -53,9 +53,11 @@ typedef enum asn1_universal_type {
     ASN1_TYPE_SEQUENCE = 16,
     ASN1_TYPE_SET = 17,
     ASN1_TYPE_PRINTABLESTRING = 19,
+    ASN1_TYPE_T61STRING = 20,
     ASN1_TYPE_IA5STRING = 22,
     ASN1_TYPE_UTCTIME = 23,
-    ASN1_TYPE_GENERALIZEDTIME = 24
+    ASN1_TYPE_GENERALIZEDTIME = 24,
+    ASN1_TYPE_VISIBLESTRING = 26
 } asn1_universal_type_t;
 
 typedef unsigned int asn1_type_t;
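
Note: the values added here (20 for T61String, 26 for VisibleString) are the universal tag numbers carried in the low five bits of a BER/DER identifier octet. A minimal sketch of that decoding follows, assuming the low-tag-number form only. `sketch_identifier_t` and `sketch_parse_identifier` are hypothetical names for illustration; asinine's own token layout and parser may well differ.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical type for this sketch only. */
typedef struct {
    unsigned class_bits; /* 0 = universal, 1 = application, 2 = context, 3 = private */
    bool constructed;    /* bit 6 of the identifier octet */
    unsigned tag;        /* low-tag-number form, 0..30 */
} sketch_identifier_t;

/* Decode a single identifier octet; reject the high-tag-number form. */
static bool
sketch_parse_identifier(uint8_t octet, sketch_identifier_t *out)
{
    if ((octet & 0x1f) == 0x1f) {
        /* Tag continues in following octets; not handled here. */
        return false;
    }

    out->class_bits = octet >> 6;
    out->constructed = (octet & 0x20) != 0;
    out->tag = octet & 0x1f;
    return true;
}
```

Under this encoding a primitive T61String starts with the octet 0x14 and a primitive VisibleString with 0x1a.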

include/asinine/x509.h

+3-4
@@ -34,10 +34,8 @@ typedef enum x509_algorithm {
 } x509_algorithm_t;
 
 typedef struct {
-    asn1_token_t common_name;
-    asn1_token_t country_name;
-    asn1_token_t organization;
-    asn1_token_t organization_unit;
+    asn1_token_t root;
+    size_t num_rdns;
 } x509_name_t;
 
 struct x509_cert {
@@ -55,6 +53,7 @@ void x509_cert_init(x509_cert_t *cert);
 x509_err_t x509_parse(x509_cert_t *cert, const uint8_t *data, size_t num);
 x509_err_t x509_validate(const x509_cert_t *cert);
 
+bool x509_name_eq(const x509_name_t *a, const x509_name_t *b);
 #ifdef __cplusplus
 }
 #endif

src/asn1-types.c

+36-12
@@ -20,28 +20,48 @@ validate_string(const asn1_token_t *token)
 {
     const uint8_t *data;
 
-    if (asn1_is(token, ASN1_CLASS_UNIVERSAL, ASN1_TYPE_PRINTABLESTRING)) {
+    if (token == NULL || token->class != ASN1_CLASS_UNIVERSAL) {
+        return false;
+    }
+
+    switch (token->type) {
+    case ASN1_TYPE_PRINTABLESTRING:
         for (data = token->data; data < token->data + token->length; data++) {
-            if (*data == ' ') {
+            // Space
+            if (*data == 0x20) {
                 continue;
             }
 
-            if (*data < '\'' || *data > 'z') {
+            // ' and z
+            if (*data < 0x27 || *data > 0x7a) {
                 return false;
             }
 
-            if (*data == '*' || *data == ';' || *data == '<' || *data == '>' ||
-                *data == '@') {
+            // Illegal characters: *, ;, <, >, @
+            if (*data == 0x2a || *data == 0x3b || *data == 0x3c || *data == 0x3e
+                || *data == 0x40) {
                 return false;
             }
         }
-    } else if (asn1_is(token, ASN1_CLASS_UNIVERSAL, ASN1_TYPE_IA5STRING)) {
+        break;
+
+    case ASN1_TYPE_IA5STRING:
+    case ASN1_TYPE_VISIBLESTRING:
+    case ASN1_TYPE_T61STRING:
         for (data = token->data; data < token->data + token->length; data++) {
-            if (*data < 0 || *data > 127) {
+            /* Strictly speaking, control codes are allowed for IA5STRING, but
+             * since we don't have a way of dealing with code-page switching we
+             * restrict the type. This is non-conformant to the spec.
+             * Same goes for T61String, which can switch code pages mid-stream.
+             * We assume that the initial code-page is #6 (ASCII), and flag
+             * switching as an error. */
+            if (*data < 0x20 || *data > 0x7f) {
                 return false;
             }
         }
-    } else if (asn1_is(token, ASN1_CLASS_UNIVERSAL, ASN1_TYPE_UTF8STRING)) {
+        break;
+
+    case ASN1_TYPE_UTF8STRING: {
         enum {
             LEADING,
             CONTINUATION
@@ -89,7 +109,10 @@ validate_string(const asn1_token_t *token)
                 }
             }
         }
-    } else {
+        break;
+    }
+
+    default:
         return false;
     }
 
@@ -196,7 +219,6 @@ asn1_time(const asn1_token_t *token, asn1_time_t *time)
         31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31
     };
 
-    const char * const end = ((char *)token->data) + token->length;
     const char *data = (char *)token->data;
 
     union {
@@ -230,7 +252,7 @@ asn1_time(const asn1_token_t *token, asn1_time_t *time)
 
     if (*data != 'Z') {
         // Try to decode seconds
-        if (data + 2 >= end) {
+        if (data + 2 >= (char*)token->end) {
             // Need at least another char for seconds, plus 'Z' or timezone
             return ASN1_ERROR_INVALID;
         }
@@ -416,5 +438,7 @@ asn1_is_string(const asn1_token_t *token)
         (token->class == ASN1_CLASS_UNIVERSAL) &&
         (token->type == ASN1_TYPE_PRINTABLESTRING ||
          token->type == ASN1_TYPE_IA5STRING ||
-         token->type == ASN1_TYPE_UTF8STRING);
+         token->type == ASN1_TYPE_UTF8STRING ||
+         token->type == ASN1_TYPE_VISIBLESTRING ||
+         token->type == ASN1_TYPE_T61STRING);
 }
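
Note on the PrintableString branch above: the range test 0x27..0x7a minus the five listed characters still lets through the bytes 0x5b to 0x60 ([, \, ], ^, _ and `), which are not in the PrintableString alphabet of X.680. The following is a minimal standalone sketch of a stricter check, together with the restricted 0x20..0x7f check used for IA5String, VisibleString and T61String. The function names are hypothetical and operate on a plain buffer rather than on `asn1_token_t`; the sketch assumes an ASCII execution character set.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: true if c belongs to the PrintableString alphabet. */
static bool
is_printable_char(uint8_t c)
{
    if ((c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
        (c >= '0' && c <= '9')) {
        return true;
    }

    /* Space plus the punctuation allowed in PrintableString. The NUL test
     * is needed because strchr() also matches the string terminator. */
    return c != '\0' && strchr(" '()+,-./:=?", (int)c) != NULL;
}

/* Hypothetical helper: strict PrintableString validation of a buffer. */
static bool
is_printable_string(const uint8_t *data, size_t length)
{
    for (size_t i = 0; i < length; i++) {
        if (!is_printable_char(data[i])) {
            return false;
        }
    }
    return true;
}

/* Same restriction as the diff applies to IA5String, VisibleString and
 * T61String: no control codes, so no ESC, LS0/LS1 or SS2/SS3 and therefore
 * no code-page switching. */
static bool
is_restricted_ascii(const uint8_t *data, size_t length)
{
    for (size_t i = 0; i < length; i++) {
        if (data[i] < 0x20 || data[i] > 0x7f) {
            return false;
        }
    }
    return true;
}
```

A T61String that begins with a designation such as ESC "(B" is rejected by `is_restricted_ascii`, which matches the "flag switching as an error" behaviour described in the comment in the diff.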

src/tests/asn1-tests.c

+3-1
@@ -98,7 +98,7 @@ test_asn1_oid_to_string(void)
     const asn1_oid_t oid = ASN1_OID(1,2,3);
     const asn1_oid_t invalid_oid = ASN1_OID(1);
 
-    check(asn1_oid_to_string(&oid, oid_str, sizeof(oid_str)) == ASN1_OK);
+    check(asn1_oid_to_string(&oid, oid_str, sizeof(oid_str)));
     check(strncmp("1.2.3", oid_str, 5) == 0);
 
     check(!asn1_oid_to_string(&invalid_oid, oid_str, sizeof(oid_str)));
@@ -327,6 +327,8 @@ test_asn1_all(int *tests_run)
 {
     declare_set;
 
+    printf("sizeof asn1_token_t: %zu\n", sizeof(asn1_token_t));
+
     run_test(test_asn1_oid_decode);
     run_test(test_asn1_oid_decode_invalid);
     run_test(test_asn1_oid_to_string);

src/tests/x509-tests.c

+21-6
@@ -9,20 +9,33 @@
 #include "asinine/tests/certs.h"
 
 static char*
-test_x509_parse(void)
+test_x509_certs(void)
 {
     x509_cert_t cert;
     size_t i;
     bool errors;
 
     for (errors = false, i = 0; i < x509_certs_num; i++) {
+        const char * const host = x509_certs[i].host;
         const uint8_t * const data = x509_certs[i].data;
         const size_t length = x509_certs[i].length;
 
-        if (x509_parse(&cert, data, length) != X509_OK) {
-            errors = true;
-
-            printf("> %s (#%lu) failed to parse\n", x509_certs[i].host, i);
+        switch (x509_parse(&cert, data, length)) {
+        case X509_OK: {
+            continue;
+        }
+
+        case X509_ERROR_UNSUPPORTED: {
+            printf("> %s (#%zu) uses unsupported features\n", host, i);
+            errors = true;
+            break;
+        }
+
+        default: {
+            printf("> %s (#%zu) failed to parse\n", host, i);
+            errors = true;
+            break;
+        }
         }
     }
 
@@ -36,7 +49,9 @@ test_x509_all(int *tests_run)
 {
     declare_set;
 
-    run_test(test_x509_parse);
+    printf("sizeof x509_cert_t: %zu\n", sizeof(x509_cert_t));
+
+    run_test(test_x509_certs);
 
     end_set;
 }

0 commit comments
