Editing Design

It was time to work out some use cases for how editing would actually work from the front end, and then to design how the back end would support it.

I wrote this file: src/dockb/services/README_editing.md

Writing it exposed a weakness in the design: the models will be database-backed, so as they change, their state relative to the database needs to be tracked:

Are they "new", not in the database yet?
They could be "changed" - already in the database, but modified in memory.
And they could be "deleted" - previously in the database, but needing to be removed.

This means that we need to somehow keep track of the ones that are deleted; they can't just be garbage-collected. Therefore, I asked the LLM to list all the ways that a child could be deleted, directly or indirectly. It replied:

Here are all the deletion pathways:
Direct single-child removal
1. collection.delete(child_id) — removes by ID from the DockbCollection, returns True/False, clears parent.
2. del collection[child_id] — same but raises KeyError on miss, also clears parent.
3. parent.delete_child(child_id) — model-level method that delegates to the local DockbCollection.delete(), then recurses into children (for Document, Chapter, Paragraph). Sentence is direct-only. Token always returns False.
Bulk removal
4. collection.clear() — removes all items from a DockbCollection, clears parent on each.
5. collection[:] = new_items — full-slice assignment, calls clear() then extend().
6. model.clear_semantics() — recursively clears the entire subtree below the model:
- Document → iterates chapters calling chapter.clear_semantics(), then chapters.clear()
- Chapter → iterates paragraphs calling paragraph.clear_semantics(), then paragraphs.clear()
- Paragraph → iterates sentences calling sentence.clear_semantics(), then sentences.clear()
- Sentence → tokens.clear()
- Token → no-op (leaf)

Adding code to handle each of these pathways would be awkward, and if the logic changes later, it may no longer cover every case. However, DockbModelBase has a set_parent() method which is always called to set the parent (even to None, in the case of deletion).
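
To illustrate why that single method is a convenient hook, here is a minimal sketch. The `Node` and `Collection` classes are simplified, hypothetical stand-ins for DockbModelBase and DockbCollection, not the real code, and child ids are assumed to be unique. Three of the removal pathways listed above all end in the same `set_parent(None)` call:

```python
class Node:
    """Hypothetical stand-in for a DockbModelBase subclass."""

    def __init__(self, node_id):
        self.id = node_id
        self.parent = None
        self.unparent_count = 0  # how many times this node lost its parent

    def set_parent(self, parent):
        # Single choke point: every removal pathway ends up here.
        if self.parent is not None and parent is None:
            self.unparent_count += 1
        self.parent = parent


class Collection:
    """Hypothetical stand-in for DockbCollection, keyed by child id."""

    def __init__(self, owner):
        self._owner = owner
        self._items = {}

    def append(self, child):
        # Ids are assumed unique in this sketch.
        self._items[child.id] = child
        child.set_parent(self._owner)

    def delete(self, child_id):
        child = self._items.pop(child_id, None)
        if child is None:
            return False
        child.set_parent(None)
        return True

    def __delitem__(self, child_id):
        child = self._items.pop(child_id)  # raises KeyError on a miss
        child.set_parent(None)

    def clear(self):
        for child in list(self._items.values()):
            child.set_parent(None)
        self._items.clear()


owner = Node("doc")
chapters = Collection(owner)
a, b, c = Node("a"), Node("b"), Node("c")
for node in (a, b, c):
    chapters.append(node)

chapters.delete("a")   # pathway 1
del chapters["b"]      # pathway 2
chapters.clear()       # pathway 4
```

Whichever route removes the child, the bookkeeping happens in one place, so a new removal pathway added later is covered as long as it calls `set_parent(None)`.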

I added tests that assert that an (as yet non-existent) on_deleted() method is fired when the parent changes from an actual parent to None. When I say "I added tests...", I mean that I told the LLM all about my thoughts as stated above, leaving it in plan mode. Then I switched it to build mode and told it to write the tests, but to wait for me, because I would likely change the tests...

...and it made a mistake. It added a test that asserted buggy behaviour as if it were correct. I instructed it:

test_append_upsert_does_not_call_on_deleted() is a wrong test, because when the new child is
added with the same key, the previous one is in fact deleted. The collection does not call the set_parent
to None for the one that was previously in the list. This is a bug. It needs to be fixed, and this test
needs to be changed (in name and logic) so that it checks that on_deleted was called in this case.

It was also not testing the case where on_deleted() should not be called, so I told it:

A test is needed for when the model object is not parented yet, and if it's set_parent
is called with None. In this case, on_deleted should not be called
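
The three behaviours discussed above can be sketched as plain test functions. This is only an illustration under assumptions: `Model` and `KeyedCollection` are hypothetical, simplified stand-ins for DockbModelBase and DockbCollection, and the test names are invented rather than taken from the project's test file.

```python
class Model:
    """Hypothetical model base; only the parent bookkeeping is sketched."""

    def __init__(self, key):
        self.key = key
        self.parent = None
        self.deleted_calls = 0

    def set_parent(self, parent):
        was_parented = self.parent is not None
        self.parent = parent
        # Fire only when going from "has a parent" to "has none".
        if was_parented and parent is None:
            self.on_deleted()

    def on_deleted(self):
        self.deleted_calls += 1


class KeyedCollection:
    """Hypothetical collection where append() upserts by key."""

    def __init__(self, owner):
        self._owner = owner
        self._items = {}

    def append(self, child):
        previous = self._items.get(child.key)
        if previous is not None and previous is not child:
            previous.set_parent(None)  # the displaced child counts as deleted
        self._items[child.key] = child
        child.set_parent(self._owner)


def test_unparenting_calls_on_deleted():
    owner, child = Model("owner"), Model("c")
    child.set_parent(owner)
    child.set_parent(None)
    assert child.deleted_calls == 1


def test_unparented_model_set_to_none_does_not_call_on_deleted():
    child = Model("c")
    child.set_parent(None)
    assert child.deleted_calls == 0


def test_append_upsert_calls_on_deleted_for_replaced_child():
    owner = Model("owner")
    collection = KeyedCollection(owner)
    first, second = Model("same-key"), Model("same-key")
    collection.append(first)
    collection.append(second)
    assert first.deleted_calls == 1 and first.parent is None
    assert second.deleted_calls == 0 and second.parent is owner
```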

Doing TDD and using the LLM to assist writing the tests, carefully supervised, works very well. It reduces the level of tedium that usually oppresses our mood. The creative part is designing how it will work. Writing the tests expresses that, but it is tedious. We can use the LLM to reduce the tedium, as long as we keep a very close eye on what it has done.

Next, I wrote the design into src/dockb/services/README_data_states.md for the extra states:

new
deleted
changed

I wrote into the design document:

## Automatically setting the state

When a model is un-parented, its state will become DELETED, unless it was
previously NEW, in which case it becomes _ (Nothing)

In any of the cases below which represent semantic hierarchy changes,
its state becomes CHANGED, unless it was previously NEW, in which case it remains NEW.
- it is reparented
- any of its children are removed
- one or more children are added
- the order of its children is changed

A model becomes NEW when it has been created and receives data, either text or children.
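
The first two rules in this excerpt can be sketched as pure transition functions. The `DataState` enum, its member `NOTHING` (standing in for "_"), and the function names are assumptions made for illustration, not names from the codebase. The third rule (becoming NEW on first data) is left out, because it depends on how a freshly created model is told apart from one loaded from the database, which the excerpt does not specify.

```python
from enum import Enum, auto


class DataState(Enum):
    """Hypothetical names for the states in the design excerpt."""

    NOTHING = auto()   # the "_ (Nothing)" state
    NEW = auto()       # not in the database yet
    CHANGED = auto()   # in the database, but modified in memory
    DELETED = auto()   # in the database, but needs to be removed


def after_unparent(state: DataState) -> DataState:
    # A NEW model never reached the database, so there is nothing to delete.
    if state is DataState.NEW:
        return DataState.NOTHING
    return DataState.DELETED


def after_hierarchy_change(state: DataState) -> DataState:
    # Reparenting, or adding, removing, or reordering children.
    # A NEW model stays NEW: it still has to be inserted, not updated.
    if state is DataState.NEW:
        return DataState.NEW
    return DataState.CHANGED
```

Keeping the rules as small functions over an enum makes each sentence of the design directly testable, independently of the collection code that will eventually trigger them.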

Then I prompted the LLM as follows:

@src/dockb/services/README_data_states.md discuss with me when to set the model state to CHANGED, automatically. You will need to have the code automatically sensing when the hierarchy changes.

Its response was basically that DockbCollection needs to be able to call an on_changed callback of some sort, which makes very good sense. It also highlighted some edge cases.
However, I could see a number of gaps in its analysis, and this highlighted to me the need to write the tests. It did confirm to me that most of the logic will be in the DockbCollection, calling an abstract method "on_changed" in the DockbModelBase abstract class. Some logic will also be in the models themselves.
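
A rough sketch of that division of labour follows, under assumptions: `ModelBase`, `NotifyingList`, its `reorder()` method, and the `change_count` counter are hypothetical and much simpler than the real DockbModelBase and DockbCollection. The collection detects the hierarchy change, and the owning model decides what to do about it in `on_changed()`.

```python
from abc import ABC, abstractmethod


class ModelBase(ABC):
    """Hypothetical base class; subclasses decide how to react to changes."""

    @abstractmethod
    def on_changed(self):
        ...


class NotifyingList:
    """Hypothetical child collection that reports hierarchy changes."""

    def __init__(self, owner):
        self._owner = owner
        self._items = []

    def append(self, child):
        self._items.append(child)
        self._owner.on_changed()

    def remove(self, child):
        self._items.remove(child)  # raises ValueError if absent
        self._owner.on_changed()

    def reorder(self, new_order):
        new_order = list(new_order)
        if sorted(map(id, new_order)) != sorted(map(id, self._items)):
            raise ValueError("reorder() must receive exactly the same children")
        # Notify only if the order really changed.
        if any(a is not b for a, b in zip(new_order, self._items)):
            self._items = new_order
            self._owner.on_changed()


class Paragraph(ModelBase):
    def __init__(self):
        self.change_count = 0
        self.sentences = NotifyingList(self)

    def on_changed(self):
        # A real model would update its data state here.
        self.change_count += 1
```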

I noticed that the logic in the DockbModelBase class regarding parenting and unparenting had a weakness (causing one edge case the LLM identified), so I made changes to ensure that in the case of a re-parent:

The previous parent will no longer contain the child, and
The on_deleted() callback won't be called.
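
These two guarantees can be sketched as follows. This is a simplification under assumptions: the hypothetical `Node` class manages its parent's `children` list directly, whereas in the real code the bookkeeping is shared between DockbModelBase.set_parent() and DockbCollection.

```python
class Node:
    """Hypothetical model with just enough state to show re-parenting."""

    def __init__(self, name):
        self.name = name
        self.parent = None
        self.children = []
        self.deleted_calls = 0

    def set_parent(self, new_parent):
        old_parent = self.parent
        if old_parent is new_parent:
            return  # nothing changes, including None -> None
        if old_parent is not None:
            old_parent.children.remove(self)  # the old parent lets go
        self.parent = new_parent
        if new_parent is not None:
            new_parent.children.append(self)  # a move, not a deletion
        else:
            self.on_deleted()  # had a parent, now has none

    def on_deleted(self):
        self.deleted_calls += 1


chapter_a, chapter_b, paragraph = Node("A"), Node("B"), Node("p")
paragraph.set_parent(chapter_a)
paragraph.set_parent(chapter_b)  # re-parent: moved from A to B
```

After the move, `chapter_a.children` is empty, `chapter_b.children` holds the paragraph, and `on_deleted()` has not fired. It fires only on a later `paragraph.set_parent(None)`.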

Then I prompted the LLM:

I've made some changes to @src/dockb/models/utils/dockb_collection.py in the
DockbModelBase class's set_parent method, and also everywhere in DockbCollection that calls it.
Note the comments in the method, and the new logic. Check that the code matches the comment,
and also look at the modified and new tests in
@tests/dockb/models/utils/test_dockb_collection.py#L362,375 and check that the logic is now well tested.

I then went through a few iterations of testing and fixing, prompting the LLM similarly each time. It found a few holes in my logic. It might have been quicker just to let it do the work, but this logic is central and very important to get right. Though I got it wrong myself initially, I understood what I was trying to achieve, and it was just too difficult to explain it all to the LLM.
However, if I'm being honest, I think I was suffering from "LLM fatigue" after specifying, specifying, specifying (with Claude at work), and I just wanted to write some code.

I did some more work with the LLM to tidy up the code, which is not interesting enough to talk about here, and then decided it was time to link the models to a database.
I prompted it:

Given our models (Document -> Chapter -> Paragraph -> Sentence -> Token) as described already in @src/dockb/models/README.md - what is the most natural way to describe that so that the model objects can be stored in and retrieved from Neo4j, maintaining their semantic relationships?

It responded with some semi-useful information for mapping the models using Neo4j's Cypher language. I'm not very familiar with Cypher, so I started an interactive discussion with the BigPickle LLM, which I'll describe on the next page.