Little Python is a restricted subset of Python 3 (and 2.7).
This is a work in progress. The implementation does not yet match this spec, so the grammar below is slightly bogus in places. Hopefully you get the idea, though.
Simple:
Harder:
Keywords: "and", "not", "or",
"True", "False",
"class", "def", "yield", "return",
"while", "for", "in", "if", "elif", "else", "break", "continue",
"from", "import",
"pass",
"print"
Punctuation: ',' '(' ')' ':' '*' '/' '+' '-' '**' **[TBD]**
COMPARISON_OPERATOR **[TBD]**
ASSIGN
COMPARISON_OPERATOR: (<|>|==|>=|<=|<>|!=|in|not +in|is|is +not)
ASSIGN: '='
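The COMPARISON_OPERATOR pattern above can be tried out directly with Python's re module. This is only a sketch: the helper name is_comparison_operator and the reordered pattern LONGEST_FIRST are made up for illustration and are not part of Pyxie. One thing it shows is that re tries alternatives left to right, so a prefix match on the pattern as written reads `<=` as `<`.

```python
import re

# Pattern copied verbatim from the spec.
COMPARISON_OPERATOR = r"(<|>|==|>=|<=|<>|!=|in|not +in|is|is +not)"

# Hypothetical reordering (not from the spec): longer alternatives first.
LONGEST_FIRST = r"(is +not|not +in|==|>=|<=|<>|!=|in|is|<|>)"


def is_comparison_operator(text):
    """True if the whole of text is one comparison operator."""
    return re.fullmatch(COMPARISON_OPERATOR, text) is not None


# A prefix match stops at the first alternative that succeeds.
as_written = re.match(COMPARISON_OPERATOR, "<= 5").group(0)
reordered = re.match(LONGEST_FIRST, "<= 5").group(0)
```

With these definitions as_written is "<" while reordered is "<=". Whether this matters depends on how the lexer applies the pattern, which this spec does not say.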
Structural: EOL INDENT DEDENT
EOL -- should be a logical end of line; at present it is actually '\n'
INDENT -- emitted after increased number of leading spaces after EOL
DEDENT -- emitted after decreased number of leading spaces after EOL
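The INDENT/DEDENT rule above is the usual indentation-stack approach. A minimal sketch, assuming spaces only and ignoring blank lines (both assumptions, as is the name structural_tokens and the placeholder LINE token that stands in for the real tokens on a line):

```python
def structural_tokens(source):
    """Yield (kind, text) pairs: INDENT, DEDENT, LINE and EOL."""
    indents = [0]  # stack of indentation widths currently open
    for raw in source.split("\n"):
        if not raw.strip():
            continue  # assumption: blank lines do not affect indentation
        width = len(raw) - len(raw.lstrip(" "))
        if width > indents[-1]:
            indents.append(width)
            yield ("INDENT", "")
        while width < indents[-1]:
            # Note: a dedent to a width that was never opened is not detected here.
            indents.pop()
            yield ("DEDENT", "")
        yield ("LINE", raw.strip())
        yield ("EOL", "\n")
    while len(indents) > 1:  # close any blocks still open at end of input
        indents.pop()
        yield ("DEDENT", "")


tokens = list(structural_tokens("while True:\n    pass\nx = 1\n"))
kinds = [kind for kind, _ in tokens]
```

For that three-line input, kinds comes out as LINE, EOL, INDENT, LINE, EOL, DEDENT, LINE, EOL.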
Literals: IDENTIFIER NUMBER STRING
IDENTIFIER: [a-zA-Z_][a-zA-Z0-9_]*
NUMBER: BINARY OCTAL HEX FLOAT INTEGER
BINARY -- 0b\d+
OCTAL -- 0o\d+
HEX -- 0x([abcdef]|\d)+
FLOAT -- \d+\.\d+
INTEGER -- \d+
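To see what the NUMBER patterns accept as written, here is a small sketch that tries them in the order listed. The names NUMBER_PATTERNS and classify_number are hypothetical, and the real lexer may apply the patterns differently.

```python
import re

# Patterns copied from the spec; prefixed forms are tried before FLOAT and INTEGER.
NUMBER_PATTERNS = [
    ("BINARY", r"0b\d+"),
    ("OCTAL", r"0o\d+"),
    ("HEX", r"0x([abcdef]|\d)+"),
    ("FLOAT", r"\d+\.\d+"),
    ("INTEGER", r"\d+"),
]


def classify_number(text):
    """Return the name of the first pattern matching all of text, else None."""
    for name, pattern in NUMBER_PATTERNS:
        if re.fullmatch(pattern, text):
            return name
    return None
```

Taken literally, these patterns classify "0b102" as BINARY (since \d allows any digit) and reject "0xFF" (only lowercase a-f is listed). That may or may not be intended.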
STRING: DQUOTESTRING | SQUOTESTRING
DQUOTESTRING: "([^"]|.)*"
SQUOTESTRING: '([^']|.)*'
CHARACTER: SCHARACTER | DCHARACTER
SCHARACTER: c'([^']|.)'
DCHARACTER: c"([^"]|.)"
I'm actually contemplating having b'<char>' instead, but that
makes single-character byte strings tricky. This will probably
be revisited, but one thought is this: if a single-character
byte string is actually required, write b'C'+b'', i.e. append
an empty byte string. The compiler would be special-cased to
detect this and force the expression to be the single byte string
b'C'. It's a bit icky, so for the moment I've added a character
literal instead to see what works better.
This isn't ideal, but it deals with the fact that often we do
want to be able to deal with just characters C in embedded
systems.
program : statements
statements : statement
| statement statements
statement_block : INDENT statements DEDENT
statement : EOL
| assignment_statement
| general_expression **[PARTIAL]**
| while_statement
| break_statement
| continue_statement
| if_statement
| for_statement
| import_statement **[TBD]**
| class_statement **[TBD]**
| def_statement **[TBD]**
| return_statement **[TBD]**
| yield_statement **[TBD]**
| pass_statement
NB: Previously this included a print_statement. print is now a function call, à la Python 3.
Note: general_expression [PARTIAL] means general expressions are parsed, but not all types have the appropriate functionality yet.
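As an illustration of the statement forms listed above, here is a short program that uses only assignment, for over range, if, continue, and a function call. It is ordinary Python 3. Since the implementation does not yet match this spec, treat it as a sketch of the intended subset rather than a tested Pyxie program.

```python
# Sum the numbers 0..4, skipping 3.
total = 0
for i in range(5):
    if i == 3:
        continue
    total = total + i
print(total)
```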
Open question:
(These are open questions because they are harder to implement on some levels. assert would be useful, though more so if try/except were implemented.)
pass_statement : PASS
class_statement : CLASS PARENL ident_list PARENR COLON EOL class_block
class_block : INDENT class_statementlist DEDENT
class_statementlist : def_statement
| assignment_statement
def_statement : DEF identifier PARENL PARENR COLON EOL statement_block
| DEF identifier PARENL ident_list PARENR COLON EOL statement_block
yield_statement : YIELD general_expression
return_statement : RETURN
| RETURN general_expression
ident_list : identifier
| identifier COMMA ident_list
assignment_statement : identifier ASSIGN general_expression
import_statement : FROM identifier IMPORT identifier
| IMPORT identifier
All of these have been done - to a BARE level. That is:
In for_statement, general_expression is required to be an iterator function listed in pyxie.model.functions.builtins. This is currently just the function range, with an iterator implementation in clib. This is, however, the "correct" structure, not a sidestep.
In while_statement, general_expression is expected to evaluate to a bool.
for_statement : FOR identifier IN general_expression COLON EOL statement_block
while_statement : WHILE general_expression COLON EOL statement_block
break_statement : BREAK
continue_statement : CONTINUE
if_statement : IF general_expression COLON EOL statement_block
| IF general_expression COLON EOL statement_block extended_if_clauses
extended_if_clauses : else_clause
| elif_clause
elif_clause : ELIF general_expression COLON EOL statement_block
| ELIF general_expression COLON EOL statement_block extended_if_clauses
else_clause : ELSE COLON EOL statement_block
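The control-flow rules above (while, break, and if with elif/else clauses) fit together as in this small example. It is plain Python 3 chosen to stay within the grammar as written; the variable names are arbitrary.

```python
count = 0
label = ""
while True:
    count = count + 1
    if count < 3:
        label = "low"
    elif count == 3:
        label = "three"
    else:
        break  # leaves the loop on the fourth pass
```

The if_statement here takes the second production (with extended_if_clauses), and the elif_clause in turn carries the trailing else_clause.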
general_expression : boolean_expression
boolean_expression : boolean_and_expression
| boolean_expression OR boolean_and_expression
boolean_and_expression : boolean_not_expression
| boolean_and_expression AND boolean_not_expression
boolean_not_expression : relational_expression
| NOT boolean_not_expression
relational_expression : relational_expression COMPARISON_OPERATOR expression
| expression
NOTE: Not all types are valid yet, and truthiness needs implementing
expression : arith_expression **[WIP]**
| arith_expression PLUS expression
| arith_expression MINUS expression
| arith_expression POWER expression **[TBD]**
arith_expression : expression_atom
| expression_atom TIMES arith_expression
| expression_atom DIVIDE arith_expression
expression_atom : value_literal
| func_call
| PARENL general_expression PARENR
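Read literally, the expression and arith_expression rules above are right-recursive, so a chain such as 10 - 3 - 2 groups as 10 - (3 - 2), where Python groups it as (10 - 3) - 2. The sketch below is a hypothetical hand-written recursive-descent reading of those rules, not how the Pyxie parser is built. All function names are made up for illustration. It handles integers only and omits POWER, booleans, and function calls.

```python
import re


def tokenize(text):
    # Integers plus the punctuation used by the expression rules.
    return re.findall(r"\d+|[-+*/()]", text)


def parse_expression(tokens):
    # expression : arith_expression | arith_expression (PLUS|MINUS) expression
    left = parse_arith(tokens)
    if tokens and tokens[0] in ("+", "-"):
        op = tokens.pop(0)
        right = parse_expression(tokens)  # recursion on the right-hand side
        return (op, left, right)
    return left


def parse_arith(tokens):
    # arith_expression : expression_atom | expression_atom (TIMES|DIVIDE) arith_expression
    left = parse_atom(tokens)
    if tokens and tokens[0] in ("*", "/"):
        op = tokens.pop(0)
        right = parse_arith(tokens)
        return (op, left, right)
    return left


def parse_atom(tokens):
    # expression_atom : value_literal | PARENL general_expression PARENR
    # Simplification: the parenthesised case re-enters parse_expression.
    token = tokens.pop(0)
    if token == "(":
        inner = parse_expression(tokens)
        tokens.pop(0)  # discard ")"
        return inner
    return int(token)


def evaluate(node):
    if isinstance(node, int):
        return node
    op, left, right = node
    a, b = evaluate(left), evaluate(right)
    if op == "+":
        return a + b
    if op == "-":
        return a - b
    if op == "*":
        return a * b
    return a / b


tree = parse_expression(tokenize("10 - 3 - 2"))
```

Here tree is ("-", 10, ("-", 3, 2)) and evaluate(tree) gives 9, whereas Python evaluates 10 - 3 - 2 to 5. This is one concrete case of the operator precedence and associativity still needing to be ironed out.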
value_literal : number
| identifier
| string
| character
| boolean
Note: These are done for ints, floats, and some strings (for example, "hello"+"world" using std::string).
Incomplete string support is why this is not listed as done.
number : INTEGER
| FLOAT
| HEX
| OCTAL
| BINARY
| MINUS number
string : STRING
character : CHARACTER
boolean : TRUE | FALSE
identifier : IDENTIFIER
func_call : IDENTIFIER PARENL PARENR
| IDENTIFIER PARENL expr_list PARENR
expr_list : general_expression
| general_expression COMMA expr_list
Pyxie is intended to interact with C++, in that it compiles to C++ targeting embedded systems. To that end, it is useful to be able to pass commands through to C++. Note that the pass-through supports ONLY #include preprocessor directives.
This is done through Python comments, so for example this is legal:
#include <stdio.h>
As is this:
#include <Arduino.h>
By definition this does not support every aspect that might be needed, but it's a useful start.
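One way such a pass-through could be recognised is to scan for comment lines that look like #include directives and leave every other comment alone. This is a sketch only: INCLUDE_RE and extract_includes are hypothetical names, and accepting the quoted "header" form alongside <header> is an assumption, since the examples above only show angle brackets.

```python
import re

# Assumption: a directive is a whole line of the form
#   #include <header>   or   #include "header"
INCLUDE_RE = re.compile(r'^\s*#include\s+(<[^>]+>|"[^"]+")\s*$')


def extract_includes(source):
    """Return the #include lines found in the source, in order of appearance."""
    includes = []
    for line in source.splitlines():
        match = INCLUDE_RE.match(line)
        if match:
            includes.append("#include " + match.group(1))
    return includes


example = "#include <Arduino.h>\n# an ordinary comment\nx = 1\n"
found = extract_includes(example)
```

In this example found contains just the Arduino.h directive; the ordinary comment and the assignment are ignored.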
Lexical analyser has the following states:
(The changelog is a better place to look as to what specifically has been done)
Dictionaries, dictionary literals [TBD]
function definitions with an optional argument list [TBD]
Note: Operator precedence needs ironing out [TBD]