So far so good...
Now I have the Document model created by the LLM. It was easy. It took me a lot longer to write this commentary than it did to write the code and set the LLM working.
It worked well. The process so far is good, with a caution: make sure you write good tests.
What next?
I plan to make the services, then the controllers, then the NLP code that will work out the semantics of the text (in a queue). After that comes the integration with repositories for Neo4j, and then the front end.
Initially though, the rest of it will need all the models. I will write the classes with empty functions, I'll write the tests, and this time use a better prompt.
I started to fill out the Chapter class with empty functions, and created a very empty Paragraph class, then realised that the LLM can create the other classes for me because they are basically carbon copies.
Similarly, I can get it to create the tests... Hmmm. I said we should not let the LLM create tests, but my prompt will tell it to create the classes using the same ideas as in Document, and then get it to duplicate the tests for the new classes. They will be the same as the tests I created. I'm just getting it to do the duplication.
The following started as a prompt, then became a design document:
The system of models is a hierarchy of classes as follows:
Document has a list of Chapter objects.
Chapter has a list of Paragraph objects.
Paragraph has a list of Sentence objects.
Sentence has a list of Phrase objects.
Phrase has a list of Token objects.
A token is either a word, or punctuation,
or white space, and white space can be any
combination of one or more space, newline, or
tab character.
A Token will be enumerated as either:
- TOKEN_IS_WORD
- TOKEN_IS_PUNCTUATION
- TOKEN_IS_WHITESPACE
and the text of the Token will follow the limitations
imposed by its enumerated type.
A Token does not have a list of any other objects.
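As an aside, here is a minimal sketch of how that "has a list of" hierarchy could look in Python. It is an illustration only, not the code the LLM produced: the use of dataclasses and the field names (chapters, paragraphs, sentences, phrases, tokens) are my assumptions, and Token is cut down to just its text.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Token:
    # Leaf of the hierarchy: it holds text only and has no child list.
    text: str


@dataclass
class Phrase:
    tokens: List[Token] = field(default_factory=list)


@dataclass
class Sentence:
    phrases: List[Phrase] = field(default_factory=list)


@dataclass
class Paragraph:
    sentences: List[Sentence] = field(default_factory=list)


@dataclass
class Chapter:
    paragraphs: List[Paragraph] = field(default_factory=list)


@dataclass
class Document:
    chapters: List[Chapter] = field(default_factory=list)


# Build a one-of-everything document and walk down to the single token.
doc = Document([Chapter([Paragraph([Sentence([Phrase([Token("Hello")])])])])])
first_token = doc.chapters[0].paragraphs[0].sentences[0].phrases[0].tokens[0]
```

Each level only knows about the level directly beneath it, which is why the classes look like carbon copies of one another. The design document itself is only the prose above, with no code in it.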
So I copied that into a file in the design_documents directory. Then I referenced it in the prompt.
That worked very well. I created an almost empty Token class, then, because I'm lazy, I asked it to create me the enum class as per the docs.
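For illustration, the enum described in the design document might look something like the sketch below. The member names come from the design document; the class name TokenType, the helper text_matches_type, and the regular expressions for words and punctuation are hypothetical choices of mine. Only the white space rule (one or more space, newline, or tab characters) is taken directly from the document.

```python
import re
from enum import Enum, auto


class TokenType(Enum):
    TOKEN_IS_WORD = auto()
    TOKEN_IS_PUNCTUATION = auto()
    TOKEN_IS_WHITESPACE = auto()


# Assumed patterns. Only the white space pattern follows the design
# document; the word and punctuation patterns are guesses for this sketch.
_PATTERNS = {
    TokenType.TOKEN_IS_WORD: re.compile(r"\w+"),
    TokenType.TOKEN_IS_PUNCTUATION: re.compile(r"[^\w\s]+"),
    TokenType.TOKEN_IS_WHITESPACE: re.compile(r"[ \n\t]+"),
}


def text_matches_type(text: str, token_type: TokenType) -> bool:
    """Return True if the whole text obeys the rule for token_type."""
    return _PATTERNS[token_type].fullmatch(text) is not None
```

A Token constructor could call text_matches_type and reject text that does not fit its enumerated type, which is one way to enforce "the text of the Token will follow the limitations imposed by its enumerated type".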
Then I created a stub Token class with comments, and wrote the test_token.py and prompted it to fill in the blanks.
This worked very well and took hardly any time, apart from the time I spent writing up the description of the Token class. I admit that I changed the design of the Token class half-way through and asked the LLM to fix up the documentation and the tests.
Wasn't I saying, "Don't let the LLM write tests"? This is an exception to my rule; I'm not a dictator.