I want to expand the interpreter to understand numbers - both as tokens directly in the code (literals) and as values in variables. So that LET A = 123 no longer sets the variable A to the string "123" but to the integer 123. And so that LET B = A no longer sets the variable B to the string "A" but to the value of the variable A, in this case the integer 123.
This is probably going to be a bit of work - first I have to let the tokenizer understand numbers, so that LET A = 123 becomes LET <identifier ‘A’> <equal> <number 123>.
That means that the nextToken function has to be extended so that any digit begins a number token, and not an identifier token:
flowchart LR
A(read character)
A --> B{ }
B -->|whitespace| C([begin **whitespace** token])
B -->|digit 0-9| N([begin **number** token])
B -->|#quot;| D([begin **string** token])
B -->|=| E([**equal sign** token])
B -- letter --> I([begin **identifier** token])
By the way, I’m going to use the builtin functions in ctype.h to check for the different character types - no longer if (c == ' ' || c == '\t') but simply if (isspace(c)) - and to check for digits, if (isdigit(c)).
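The decision in the flowchart can be sketched with the ctype.h functions roughly like this. It is only an illustration, not the interpreter's actual nextToken: the TOKEN_START enum and the classify_start helper are hypothetical names made up for this example. One detail worth knowing is that the ctype.h functions expect a value that fits in an unsigned char (or EOF), hence the cast.

```c
#include <ctype.h>

/* Hypothetical enum for this sketch: which kind of token a character begins. */
typedef enum
{
    START_WHITESPACE,
    START_NUMBER,
    START_STRING,
    START_EQUALS,
    START_IDENTIFIER,
    START_UNKNOWN
} TOKEN_START;

/* Hypothetical helper: mirrors the branches of the flowchart above. */
static TOKEN_START classify_start(char c)
{
    /* ctype.h functions need a value representable as unsigned char */
    unsigned char uc = (unsigned char)c;

    if (isspace(uc))
        return START_WHITESPACE;
    if (isdigit(uc))
        return START_NUMBER; /* digits are checked before letters */
    if (c == '"')
        return START_STRING;
    if (c == '=')
        return START_EQUALS;
    if (isalpha(uc))
        return START_IDENTIFIER;
    return START_UNKNOWN;
}
```

Note that isspace accepts more than space and tab (newline and carriage return, for example), so it is slightly broader than the old comparison.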
So if the character is a digit, we continue to tokenize_number, which continues as long as there are more digits. Almost exactly the same as tokenize_whitespace:
---
title: Number
---
flowchart LR
a(next character)
a-->b{ }
b-->|digit 0-9|f
b-->|anything else|g([end token])
f(continue token)-.->a
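The loop in the diagram could look something like the sketch below. The function name and signature are assumptions for the sake of the example (the real tokenize_number probably works on the tokenizer's own state); this version just counts how many digits the number token covers.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical stand-in for tokenize_number: returns the length of the
   number token starting at text, i.e. the count of leading digits. */
static size_t number_token_length(const char *text)
{
    size_t length = 0;

    /* Continue the token while the next character is a digit; anything
       else, including the terminating '\0', ends the token. */
    while (isdigit((unsigned char)text[length]))
    {
        length++;
    }
    return length;
}
```

For the input 123 followed by a space this returns 3, and the space is left for the next token.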
And of course I need to add number as a token type:
typedef enum
{
    WHITESPACE,
    NUMBER,
    STRING,
    IDENTIFIER,
    EQUALS,
    END
} TOKEN_TYPE;

static const char *TOKEN_NAMES[] = {
    "WHITESPACE",
    "NUMBER",
    "STRING",
    "IDENTIFIER",
    "EQUALS",
    "END"};
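Since the names are looked up by enum value, the two lists have to stay in the same order, and inserting NUMBER in the middle is exactly the kind of change that can get them out of sync. One possible guard, as a sketch only: a compile-time check (C11 _Static_assert) plus a small lookup helper. The token_name function is a hypothetical addition, and the check assumes END stays the last enum value.

```c
#include <stddef.h>

typedef enum
{
    WHITESPACE,
    NUMBER,
    STRING,
    IDENTIFIER,
    EQUALS,
    END
} TOKEN_TYPE;

static const char *TOKEN_NAMES[] = {
    "WHITESPACE",
    "NUMBER",
    "STRING",
    "IDENTIFIER",
    "EQUALS",
    "END"};

/* Compilation fails if the number of names differs from the number of
   token types (assumes END is the last enum value). */
_Static_assert(sizeof TOKEN_NAMES / sizeof TOKEN_NAMES[0] == (size_t)END + 1,
               "TOKEN_NAMES is out of sync with TOKEN_TYPE");

/* Hypothetical helper: name lookup that cannot index outside the array. */
static const char *token_name(TOKEN_TYPE type)
{
    if ((int)type < 0 || (size_t)type >= sizeof TOKEN_NAMES / sizeof TOKEN_NAMES[0])
    {
        return "UNKNOWN";
    }
    return TOKEN_NAMES[type];
}
```

The check only catches a missing or extra name, not two names swapped around, so the order still has to be kept by hand.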
And that should be it - there really is nothing to it - so let’s test it!
Great, an error - so it works … It works in the sense that we get a number token - now we just need to be able to use numbers. Let’s take a look at LET first!
Earlier I allowed the variable value to be set to the value of either a string or an identifier - simply changing that to string or number should fix that!
if (token->type == STRING || token->type == NUMBER)
{
    value = token->value;
}
Test it by creating a variable with the value of a number, and printing it:
Cool, that already works - and the variable can be printed because a is an identifier, so PRINT doesn’t know that it is a number.
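At this point the value is still just the text of the token. If it should end up as a real integer, as stated at the top, the digits have to be converted at some point. A minimal sketch of that conversion with strtol, assuming the token's value is a NUL-terminated string of digits; parse_number is a hypothetical helper, not something the interpreter has yet.

```c
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: converts the text of a NUMBER token to an int.
   Returns false if no number could be read, if characters are left over
   after the number, or if the value does not fit in an int. */
static bool parse_number(const char *text, int *out)
{
    char *end = NULL;
    long parsed;

    errno = 0;
    parsed = strtol(text, &end, 10);

    if (end == text || *end != '\0')
    {
        return false; /* nothing parsed, or trailing characters */
    }
    if (errno == ERANGE || parsed < INT_MIN || parsed > INT_MAX)
    {
        return false; /* out of range for long or for int */
    }
    *out = (int)parsed;
    return true;
}
```

strtol on its own would also accept leading whitespace and a sign, but a token produced by the number tokenizer only contains digits, so that should not matter here.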
Now it should also be possible to handle <identifiers> correctly, so I can write let b = a, and thus set the value of b to 1234. Just add a special if-statement: