An Arduous Endeavor (Part 6): Save States and Rewind

This entry is part 6 of 7 in the series An Arduous Endeavor

Two emulator features that have made my gaming experience much more comfortable are save states and rewind. Save states let you take a snapshot of a game at any point in time and come back to it later – whether that is after attempting and failing to complete some task, or after leaving the game for a bit to take care of things in real life. Rewind takes that feature to the extreme, essentially taking a snapshot after every video frame, which makes it very convenient to “undo” a misstep or an unlucky random number roll without even having to plan for it.

Libretro enables both of these features via the retro_serialize and retro_unserialize functions. Once you have these functions implemented, a Libretro frontend is capable of saving and loading any number of save states.

The only requirement is that you report the required buffer size in bytes via the retro_serialize_size function. Because the rewind feature calls retro_serialize so frequently, it makes sense to keep this size as small as possible.
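
As a rough sketch of how these three entry points fit together, the block below wires them to a stand-in core. The function signatures match the declarations in libretro.h; `DemoCore`, its `frame_counter` field, and the global `core` are hypothetical placeholders for illustration, not the actual Arduous code.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the emulator core. The real Arduous class
// holds simavr and display state rather than a single counter.
struct DemoCore {
    uint32_t frame_counter = 0;

    size_t getSaveSize() const { return sizeof(frame_counter); }

    bool save(void* data, size_t size) const {
        if (size < getSaveSize()) return false;  // frontend buffer too small
        std::memcpy(data, &frame_counter, sizeof(frame_counter));
        return true;
    }

    bool load(const void* data, size_t size) {
        if (size < getSaveSize()) return false;
        std::memcpy(&frame_counter, data, sizeof(frame_counter));
        return true;
    }
};

static DemoCore core;

// The frontend asks for the size first, then hands over a buffer of at
// least that size to retro_serialize / retro_unserialize.
extern "C" size_t retro_serialize_size(void) { return core.getSaveSize(); }
extern "C" bool retro_serialize(void* data, size_t size) { return core.save(data, size); }
extern "C" bool retro_unserialize(const void* data, size_t size) { return core.load(data, size); }
```

Returning false when the buffer is too small gives the frontend a chance to report the failure instead of reading or writing past the end of the buffer.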

For Arduous, I just needed to figure out what data within the CPU and screen emulators needed to be saved off. This was more tedious than complicated.

size_t Arduous::getSaveSize() {
    size_t size = sizeof(int)                  // cpu->state
                  + sizeof(avr_cycle_count_t)  // cpu->cycle
                  + sizeof(avr_cycle_count_t)  // cpu->run_cycle_count
                  + sizeof(avr_cycle_count_t)  // cpu->run_cycle_limit
                  + sizeof(uint8_t) * 8        // cpu->sreg
                  + sizeof(int8_t)             // cpu->interrupt_state
                  + sizeof(avr_flashaddr_t)    // cpu->pc
                  + sizeof(avr_flashaddr_t)    // cpu->reset_pc
                  + cpu->ramend + 1            // cpu->data

                  + sizeof(ssd1306_virt_cursor_t)                                // screen->cursor
                  + sizeof(uint8_t) * SSD1306_VIRT_PAGES * SSD1306_VIRT_COLUMNS  // screen->vram
                  + sizeof(uint16_t)                                             // screen->flags
                  + sizeof(uint8_t)                                              // screen->command_register
                  + sizeof(uint8_t)                                              // screen->contrast_register
                  + sizeof(uint8_t)                                              // screen->cs_pin
                  + sizeof(uint8_t)                                              // screen->di_pin
                  + sizeof(uint8_t)                                              // screen->spi_data
                  + sizeof(uint8_t)                                              // screen->reg_write_sz
                  + sizeof(ssd1306_addressing_mode_t)                            // screen->addr_mode
                  + sizeof(uint8_t)                                              // screen->twi_selected
                  + sizeof(uint8_t)                                              // screen->twi_index
        ;
    return size;
}

Once I had identified all the pieces of state and how many bytes they’d take up, I just needed to copy those pieces into and out of the allocated buffer. Here’s just a snippet of that (it gets repetitive quickly):

bool Arduous::save(void* data, size_t size) {
    // Refuse to write into a buffer smaller than the reported save size.
    if (size < getSaveSize()) return false;
    auto* buffer = static_cast<uint8_t*>(data);
    memcpy(buffer, &cpu->state, sizeof(int));
    buffer += sizeof(int);
    memcpy(buffer, &cpu->cycle, sizeof(avr_cycle_count_t));
    buffer += sizeof(avr_cycle_count_t);
    memcpy(buffer, &cpu->run_cycle_count, sizeof(avr_cycle_count_t));
    buffer += sizeof(avr_cycle_count_t);
    memcpy(buffer, &cpu->run_cycle_limit, sizeof(avr_cycle_count_t));
    buffer += sizeof(avr_cycle_count_t);
    memcpy(buffer, cpu->sreg, sizeof(uint8_t) * 8);
    buffer += sizeof(uint8_t) * 8;
    // etc, etc
    return true;
}

There’s probably a much more code-efficient way to do this, as this particular method involves a lot of copy and paste. I will focus on optimizing later – for now, though, this sort of works. Sort of. Watch the clip below and see if you can spot the problem.
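
As for the copy and paste, one possible way to cut it down is a pair of small cursor helpers that copy a value and advance in a single call. This is only a sketch: `StateWriter`, `StateReader`, `DemoCpu`, `saveCpu`, and `loadCpu` are hypothetical names, and `DemoCpu` stands in for a few of the simavr fields rather than the real struct.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Appends raw bytes to a caller-owned buffer and refuses to overrun it.
class StateWriter {
public:
    StateWriter(void* data, size_t size)
        : cursor(static_cast<uint8_t*>(data)), remaining(size) {}

    template <typename T>
    StateWriter& put(const T& value) {
        static_assert(std::is_trivially_copyable<T>::value, "raw-copyable types only");
        if (!ok || sizeof(T) > remaining) { ok = false; return *this; }
        std::memcpy(cursor, &value, sizeof(T));
        cursor += sizeof(T);
        remaining -= sizeof(T);
        return *this;
    }

    bool good() const { return ok; }

private:
    uint8_t* cursor;
    size_t remaining;
    bool ok = true;
};

// Mirror image of StateWriter: reads values back in the same order.
class StateReader {
public:
    StateReader(const void* data, size_t size)
        : cursor(static_cast<const uint8_t*>(data)), remaining(size) {}

    template <typename T>
    StateReader& get(T& value) {
        static_assert(std::is_trivially_copyable<T>::value, "raw-copyable types only");
        if (!ok || sizeof(T) > remaining) { ok = false; return *this; }
        std::memcpy(&value, cursor, sizeof(T));
        cursor += sizeof(T);
        remaining -= sizeof(T);
        return *this;
    }

    bool good() const { return ok; }

private:
    const uint8_t* cursor;
    size_t remaining;
    bool ok = true;
};

// Hypothetical subset of CPU state, for illustration only.
struct DemoCpu {
    int state;
    uint64_t cycle;
    uint8_t sreg[8];
};

bool saveCpu(const DemoCpu& cpu, void* data, size_t size) {
    StateWriter writer(data, size);
    writer.put(cpu.state).put(cpu.cycle).put(cpu.sreg);
    return writer.good();
}

bool loadCpu(DemoCpu& cpu, const void* data, size_t size) {
    StateReader reader(data, size);
    reader.get(cpu.state).get(cpu.cycle).get(cpu.sreg);
    return reader.good();
}
```

Keeping the `put` and `get` chains in the same field order is what makes the round trip line up; the bounds check in each helper also replaces the manual size bookkeeping.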

Saving/loading states appears to work all right, at least for the random spot check tests I ran. It even works if you exit out of the emulator and reopen it! However, when attempting to rewind, things get messed up pretty quickly. I believe this has to do with me failing to serialize the timers and/or interrupts in the simulated AVR struct. And I am not totally sure about the best way to serialize these, because in addition to state data, they happen to contain some callback function pointers, which will definitely NOT be valid if the emulator is restarted. Here are the relevant data structures relating to timers:

/*
 * Each timer instance contains the absolute cycle number they
 * are hoping to run at, a function pointer to call and a parameter
 * 
 * it will NEVER be the exact cycle specified, as each instruction is
 * not divisible and might take 2 or more cycles anyway.
 * 
 * However if there was a LOT of cycle lag, the timer might be called
 * repeatedly until it 'catches up'.
 */
typedef struct avr_cycle_timer_slot_t {
	struct avr_cycle_timer_slot_t *next;
	avr_cycle_count_t	when;
	avr_cycle_timer_t	timer;
	void * param;
} avr_cycle_timer_slot_t, *avr_cycle_timer_slot_p;

/*
 * Timer pool contains a pool of timer slots available, they all
 * start queued into the 'free' queue, are migrated to the
 * 'active' queue when needed and are re-queued to the free one
 * when done
 */
typedef struct avr_cycle_timer_pool_t {
	avr_cycle_timer_slot_t timer_slots[MAX_CYCLE_TIMERS];
	avr_cycle_timer_slot_p timer_free;
	avr_cycle_timer_slot_p timer;
} avr_cycle_timer_pool_t, *avr_cycle_timer_pool_p;
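
The list pointers in these structs are one part of the problem that looks tractable. Assuming every `next`, `timer_free`, and `timer` pointer refers to an element of `timer_slots` (which the pool comment suggests, but which I have not verified against the simavr source), each pointer could be saved as an array index and rebuilt on load. The sketch below uses simplified, hypothetical `Slot` and `Pool` types and leaves out the callback and `param` fields, which need separate handling.

```cpp
#include <cstddef>
#include <cstdint>

constexpr int kMaxSlots = 4;     // stand-in for MAX_CYCLE_TIMERS
constexpr int32_t kNoSlot = -1;  // encodes a null pointer

// Simplified, hypothetical mirror of the simavr structs.
struct Slot {
    Slot* next;
    uint64_t when;
};

struct Pool {
    Slot slots[kMaxSlots];
    Slot* free_list;
    Slot* active;
};

// Pointer-free form that is safe to copy into a save buffer.
struct SavedPool {
    int32_t next[kMaxSlots];
    uint64_t when[kMaxSlots];
    int32_t free_list;
    int32_t active;
};

// Translate a pointer into pool.slots to its index, or kNoSlot for null.
int32_t toIndex(const Pool& pool, const Slot* slot) {
    return slot ? static_cast<int32_t>(slot - pool.slots) : kNoSlot;
}

// Translate an index back to a pointer into this particular pool.
Slot* fromIndex(Pool& pool, int32_t index) {
    return (index >= 0 && index < kMaxSlots) ? &pool.slots[index] : nullptr;
}

SavedPool savePool(const Pool& pool) {
    SavedPool out{};
    for (int i = 0; i < kMaxSlots; i++) {
        out.next[i] = toIndex(pool, pool.slots[i].next);
        out.when[i] = pool.slots[i].when;
    }
    out.free_list = toIndex(pool, pool.free_list);
    out.active = toIndex(pool, pool.active);
    return out;
}

void loadPool(Pool& pool, const SavedPool& in) {
    for (int i = 0; i < kMaxSlots; i++) {
        pool.slots[i].next = fromIndex(pool, in.next[i]);
        pool.slots[i].when = in.when[i];
    }
    pool.free_list = fromIndex(pool, in.free_list);
    pool.active = fromIndex(pool, in.active);
}
```

Because the indexes are relative to the pool itself, the restored pointers stay valid even if the pool lives at a different address after the emulator restarts.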

Understanding how best to serialize these will require digging through some simavr internals. I found a post on Stack Overflow that offered a bit of a hint, pointing to the idea of creating a function registry that is populated each time the program starts up. If there is a finite, known set of functions and param objects that could be assigned to timers, this could be workable, but it will take some digging.
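
To make the registry idea concrete, here is a minimal sketch. It assumes every callback can be registered at startup in a fixed order, so that the same function always gets the same ID; the save file would then store the ID instead of the raw pointer. `CallbackRegistry`, `TimerCallback`, and the demo callbacks are hypothetical, and the callback signature is simplified (simavr's `avr_cycle_timer_t` also receives the simulated AVR).

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Simplified, hypothetical callback signature for illustration.
using TimerCallback = uint64_t (*)(uint64_t when, void* param);

constexpr uint32_t kUnknownCallback = UINT32_MAX;

class CallbackRegistry {
public:
    // Registration order defines the IDs, so it must be identical on
    // every startup for saved IDs to stay meaningful.
    uint32_t add(TimerCallback fn) {
        table.push_back(fn);
        return static_cast<uint32_t>(table.size() - 1);
    }

    // Used while saving: pointer -> stable ID.
    uint32_t idOf(TimerCallback fn) const {
        for (size_t i = 0; i < table.size(); i++) {
            if (table[i] == fn) return static_cast<uint32_t>(i);
        }
        return kUnknownCallback;
    }

    // Used while loading: stable ID -> pointer valid in this process.
    TimerCallback lookup(uint32_t id) const {
        return id < table.size() ? table[id] : nullptr;
    }

private:
    std::vector<TimerCallback> table;
};

// Demo callbacks standing in for the real timer handlers.
uint64_t demoTimerA(uint64_t when, void*) { return when + 10; }
uint64_t demoTimerB(uint64_t when, void*) { return when + 20; }
```

The `param` pointers would need a similar treatment, and a save made by one build would only load in another if both register the same callbacks in the same order.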

Other possibilities include switching AVR simulators, extending simavr to have built-in support for serialization, or writing my own simulator with an eye toward keeping the entire state “simply” serializable. I originally started this project expecting to write my own, but that would be a significantly greater undertaking given the breadth of hardware functionality in the ATmega32U4 package. I’m not ruling it out as a possibility, though.
