Dreamcast Inside Agartha

Sifting

Registered
Registered
Joined
Aug 23, 2019
Messages
25
Reaction score
67
Points
13
So this is the next tentative dive for me. For those unfamiliar, I spent some time pulling apart the Castlevania: Resurrection proto here in this thread. It was a lot of fun, so I thought I would do another. I spoke with @Sega Dreamcast Info, and he informed me that Agartha might have a few secrets worth uncovering, so I started poking around.

Most of the work I've done on it so far is just preliminary. To start off, I'm focusing on the April 18th release, which may be found here. The first obstacle was just getting the image mounted on my Linux machine. CDI images are proprietary, so I usually just convert them into an ISO file that mount is happy with, but these images simply would not convert using any software I knew of. I finally managed to get the contents open using gditools, which I think is written by members of this forum? If so, thanks!

So on the filesystem root we have a bunch of files:

256344068 Apr 17 2001 000DUMMY.DAT
256344068 Apr 17 2001 0AGARTHA_DEMO2001.MPG
1105126 Apr 17 2001 1AGARTHA.BIN
3458 Mar 9 2001 ADXLISTDEMO.LST
36019 Apr 17 2001 AGARTHA.LST
13144526 Apr 17 2001 AGARTHA.PAK
18608 May 18 2000 AUDIO64.DRV
1021590 Mar 8 2001 BODNATH_EXTR01.ADX
1364634 Mar 8 2001 BODNATH_EXTR02.ADX
1311174 Feb 28 2001 BODNATH_EXTR08.ADX
1616130 Feb 28 2001 BODNATH_EXTR09.ADX
2110518 Feb 28 2001 GOKYO_EXTR01.ADX
1170810 Feb 28 2001 GOKYO_EXTR07.ADX
3225258 Feb 28 2001 GOSAINKUND_MOD_NIV_SHORT.ADX
1943910 Feb 28 2001 KALAPATHAR_EXTR01.ADX
1553346 Feb 28 2001 KALAPATHAR_EXTR03.ADX
992394 Feb 28 2001 KALIGANDAKI_EXTR03.ADX
1037214 Feb 28 2001 KALIGANDAKI_EXTR07.ADX
1254042 Feb 28 2001 KALIGANDAKI_EXTR09.ADX
1122066 Feb 28 2001 KALIGANDAKI_EXTR11.ADX
5826294 Feb 28 2001 KANCHENJUNGA_EQ.ADX
1533150 Feb 28 2001 KANCHENJUNGA_EXTR09.ADX
1560114 Feb 28 2001 KANCHENJUNGA_EXTR11.ADX
1130742 Feb 28 2001 KANCHENJUNGA_EXTR13.ADX
2034630 Feb 28 2001 KIANGJING_EXTR03.ADX
219474 Feb 28 2001 KIANGJING_EXTR06.ADX
1721718 Feb 28 2001 KIANGJING_EXTR07.ADX
860022 Feb 28 2001 LADAKH_EXTR02.ADX
2196882 Mar 9 2001 LOUPIOTV3.ADX
2159300 Feb 26 2001 MOULIN.F3D
1739538 Feb 28 2001 PE03_22_EXTR01.ADX
4096 May 15 18:24 RESS

The first interesting bit is the .mpg file, a 15-minute video of the game that is also posted on @Sega Dreamcast Info's site. 000DUMMY.DAT is the same file; I guess it's there to pad out the disc. Weird, but I've seen weirder. The ADX files are all music tracks, as best I can tell. They are playable using VLC or ffplay, but some seem broken or unsupported. The RESS directory is curiously sparse, which means the assets are hiding somewhere... but where?
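A quick way to confirm that two files on the disc really are byte-identical is to hash both. A minimal sketch, assuming the image contents have been extracted to the current directory; the helper names `sha1_of` and `same_contents` are made up for illustration:

```python
import hashlib

def sha1_of(path, chunk_size=1 << 20):
    """Return the SHA-1 hex digest of a file, reading it in chunks to keep memory use low."""
    digest = hashlib.sha1()
    with open(path, 'rb') as f:
        # iter() with a sentinel keeps calling f.read() until it returns b''
        for chunk in iter(lambda: f.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()

def same_contents(path_a, path_b):
    """True when both files hash to the same digest."""
    return sha1_of(path_a) == sha1_of(path_b)
```

Calling `same_contents('000DUMMY.DAT', '0AGARTHA_DEMO2001.MPG')` from the extraction directory should return True if the dummy really is a copy of the video.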

AGARTHA.PAK, of course.

However, when I popped it open in my hex editor, it was immediately apparent that this would be a more difficult task. The PAK file has no manifest or anything; it's essentially just a big blob of data. Yet there must be some way of mapping files to offsets somewhere. That's when I looked at AGARTHA.LST.

AGARTHA.LST is a big text file with a bunch of file names inside it, one per line. This got me thinking, so I went back to my hex editor and sure enough, at the top of the PAK file was a big block of uints, and more interestingly, their values were monotonically increasing. I started jumping around inside the PAK file, treating each uint as an offset; sure enough, at each location there appeared to be a different file, and in the case of the PVR textures, their headers all lined up with the file names. This was the first important discovery in the journey of unlocking this treasure chest: AGARTHA.LST contains a list of all files inside AGARTHA.PAK, and each file is written inside the PAK in the order it appears in the LST. There are 550 files total.

Code:
//Agartha.PAK template
LittleEndian ();

//one per entry in AGARTHA.LST
SetBackColor (cPurple);
const uint NUM_FILES = 550;
uint offsets[NUM_FILES];

//before each file
struct File
{
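    uint32 uncompressed;   //size of the file once decoded
    uint32 compressed;     //size of the data stored in the PAK
    uint16 flags;          //mode: 0 = stored, otherwise LZSS with minimum match length flags + 1
    byte data[compressed];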
};

local uint i = 0;
while (i < NUM_FILES)
{
    FSeek (offsets[i]);
    if (i%2 != 0) SetBackColor (cLtPurple);
    else SetBackColor (cDkPurple);
    File file;
    i++;
}
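For anyone without 010 Editor, the offset table can be sanity-checked in a few lines of Python. This is only a sketch: it assumes the table is `count` little-endian uint32 values at the very start of the PAK, and that every entry should be strictly increasing and land inside the file. The function names are hypothetical.

```python
from struct import unpack_from

def read_offsets(pak_bytes, count):
    """Unpack `count` little-endian uint32 offsets from the start of the PAK."""
    return list(unpack_from(f'<{count}I', pak_bytes, 0))

def check_offsets(offsets, pak_size):
    """Sanity checks: strictly increasing, past the table itself, and inside the file."""
    table_size = 4 * len(offsets)
    increasing = all(a < b for a, b in zip(offsets, offsets[1:]))
    in_bounds = all(table_size <= off < pak_size for off in offsets)
    return increasing and in_bounds
```

With the real archive this would be called as `check_offsets(read_offsets(pak, 550), len(pak))` after reading AGARTHA.PAK into `pak`.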

However, there's another big snag: the files are all compressed in what appears to be some form of LZW or, more probably, LZSS encoding. The giveaway here is that the early parts of each file are quite legible, but by the end of each file it's a binary mess. So in order to continue with the task of reversing Agartha, I have to figure out the decompression scheme. Fortunately, LZW/LZSS isn't exactly black magic, but there are a ton of variations and at the moment I'm unsure of how to continue, so I have to ask: does anyone have any tips on reversing compression algorithms?
 

FamilyGuy

2049 Donator
Donator
Registered
Joined
May 31, 2019
Messages
315
Reaction score
309
Points
63
AG User Name
-=FamilyGuy=-
AG Join Date
March 3, 2007
I finally managed to get the contents open using gditools, which I think is written by members of this forum? if so - thanks!
It's me and you're welcome!

You can convert adx to wav with very simple command line utilities, probably adx2wav or similar. There's also a Winamp plugin iirc.

@yzb is probably the active person who's the most experienced with DC on the fly compression.
 

Sifting

You can convert adx to wav with very simple command line utilities, probably adx2wav or similar.

Ah yeah, my mistake. I meant to write that some files seem corrupt. It seems about half are playable or convertible, but the rest just don't fly. I'm unsure of the reason for this at present.
 

FamilyGuy

Ah yeah, my mistake. I meant to write that some files seem corrupt. It seems about half are playable or convertible, but the rest just don't fly. I'm unsure of the reason for this at present.
There are a few options for encoding ADX (channels, sampling rate, etc.) and I'm not sure VLC supports them all. It's been a long time since I messed around with ADX and AHX, but IIRC ADX is essentially a custom ADPCM format with looping support. Maybe VLC can make sense of them only in some cases.

Also it seems like the implementation of the ffmpeg (libavcodec, used by VLC) decoder is locked to 44.1 kHz sampling rate, and ADX are often 22,050 Hz in DC games.

It's worth hunting down adx2wav and converting those non-working ones to wav just to be sure.
 

Anthony817

Registered
Registered
Community Contributor
Joined
Jun 2, 2019
Messages
374
Reaction score
494
Points
63
AG Join Date
May 12, 2010
It was a rare treat to see somebody dig as deeply as you have on the Castlevania disc, so really excited to see what you find here!
 

Sifting

It's worth hunting down adx2wav and converting those non-working ones to wav just to be sure.
I gave it a try, and about half of the ADX files still fail to convert. I popped them open in the hex editor and it looks like the ones that fail are a completely different format altogether. I looked up the ADX format to be sure, and they definitely do not match any known version of it. I did find something that might be interesting, though; see the attached image. The valid ADX files do not have this section, so I wonder what's going on here.
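For telling real ADX files apart from impostors, a small header check helps. This sketch assumes the publicly documented CRI ADX layout as I understand it: a big-endian header starting with the bytes 0x80 0x00, followed by a 16-bit copyright offset, four one-byte fields (encoding, block size, bit depth, channels), then a 32-bit sample rate and sample count, with the string "(c)CRI" ending right before the audio data. `parse_adx_header` is a made-up helper name, not part of any tool.

```python
from struct import unpack_from

def parse_adx_header(blob):
    """Return a dict of header fields, or None if the blob does not look like ADX."""
    if len(blob) < 16 or blob[0:2] != b'\x80\x00':
        return None
    # All multi-byte fields are assumed to be big-endian
    copyright_offset, = unpack_from('>H', blob, 2)
    encoding, block_size, bit_depth, channels = unpack_from('>4B', blob, 4)
    sample_rate, total_samples = unpack_from('>II', blob, 8)
    # Assumption: the 6-byte "(c)CRI" marker starts 2 bytes before the copyright offset
    marker_at = copyright_offset - 2
    has_marker = blob[marker_at:marker_at + 6] == b'(c)CRI'
    return {
        'encoding': encoding,
        'block_size': block_size,
        'bit_depth': bit_depth,
        'channels': channels,
        'sample_rate': sample_rate,
        'total_samples': total_samples,
        'has_marker': has_marker,
    }
```

A file that returns None here, or a dict with `has_marker` set to False, is probably not ADX at all, whatever its extension says.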

I'll try my hand at cracking the decompression this weekend. It's going to feel great once we can get this treasure chest opened up.

It was a rare treat to see somebody dig as deeply as you have on the Castlevania disc, so really excited to see what you find here!
Thank you! It was a rare treat to work on it! I hope more builds surface in the future.
 

Attachments

  • hex.png
    hex.png
    207.2 KB · Views: 0

FamilyGuy

I gave it a try, and about half of the ADX files still fail to convert. I popped them open in the hex editor and it looks like the ones that fail are a completely different format altogether. I looked up the ADX format to be sure, and they definitely do not match any known version of it. I did find something that might be interesting, though; see the attached image. The valid ADX files do not have this section, so I wonder what's going on here.

I'll try my hand at cracking the decompression this weekend. It's going to feel great once we can get this treasure chest opened up.


Thank you! It was a rare treat to work on it! I hope more builds surface in the future.
Yeah, that doesn't look like a sound file. The included path reminds me of a filesystem (like ISO 9660, which includes metadata in the header IIRC, although this is clearly not an ISO) or compiled code with metadata. Maybe an AFS or tarball in disguise? Although in that case all characters would be ASCII, IIRC.

Anyways, that's interesting, keep us posted!
 

Sifting

Alright, I sat down and wrote out a script to dump the PAK file. So far it recreates the file structure and dumps the raw binaries to disk, but the compression remains a work in progress. I did however learn that files are stored in at least 3 different modes, one of which is uncompressed. It's unclear at the moment whether the mode specifies a compression quality setting or a different algorithm.

Here's the script for anyone interested in examining the raw data:
Python:
#!/usr/bin/env python3
from struct import unpack, calcsize
import os

def uncompress (data):
    return data

def main ():
    PREFIX = 'AGARTHA'
    DEST = 'contents'
 
    #Load the manifest
    with open (PREFIX + '.LST', 'rb') as f:
        count = 0
        lines = f.readlines ()
   
        #Build a list of files
        files = []
        for ln in lines:
            #Remove blanks and comments
            ln = ln.decode ('latin').strip ().lower ().replace ('/', '\\')
            if '' == ln:
                continue
            if '#' == ln[:1]:
                continue
            #Insert the path into the list for later
            files.append (ln)
            count += 1
   
    #Figure out the common prefix
    pref = os.path.commonprefix (files).replace ('\\', '/')
    print (f'Common Prefix: "{pref}"')
 
    #Build a path list...
    paths = []
    for p in files:
        sanitised = os.path.normpath (p.replace ('\\', '/'))
        paths.append (sanitised.replace (pref, ''))
 
    #Extract files from archive...
    with open (PREFIX + '.PAK', 'rb') as f:
        offsets = unpack (f'<{count}I', f.read (calcsize (f'<{count}I')))
        for i in range (count):
            #Rebuild path on disk
            fn = os.path.basename (paths[i])
            dn = os.path.join (DEST, os.path.dirname (paths[i]))
            os.makedirs (dn, exist_ok = True)
       
            #Seek to find and read header
            f.seek (offsets[i])
            uncompressed, compressed, mode = unpack ('<IIH', f.read (calcsize ('<IIH')))
       
            #Read and uncompress contents as needed
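            #Guard against empty files and modes outside the known table
            rate = 100*compressed/uncompressed if uncompressed else 0.0
            MODES = ['uncompressed', 'UNK0', 'UNK1']
            label = MODES[mode] if mode < len (MODES) else f'mode {mode}'
            print (f'Uncompressing "{paths[i]}", ratio: {rate:.4}% ({label})')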
            data = uncompress (f.read (compressed))
       
            #Write it to disk and free data
            with open (os.path.join (dn, fn), 'wb') as out:
                out.write (data)
            del data
           
if __name__ == "__main__":
    main ()

Pretty good progress for now, but for the next few weeks I have to work 50-55 hours, so expect progress to be quite slow.

Addendum:

The encoding is definitely LZSS, at least the 0x02 mode. By complete serendipity I found out that it follows a similar scheme to the algorithm used in Allegro: a control byte is issued for every 8 tokens, and a token may be a 1-byte literal or a 2-byte base/length pair. So at most there may be 16 bytes of data between control bytes. I haven't quite got it working though.
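The control-byte scheme described above can be sketched as a small tokenizer. It assumes flags are consumed from the least significant bit upward and that a set bit means a literal, which matches the decompression routines posted in this thread but is not confirmed against the game's own loader. `split_tokens` is an illustrative name.

```python
def split_tokens(stream):
    """Yield ('lit', byte) or ('pair', b0, b1) tuples from a flag-prefixed LZSS stream."""
    pos = 0
    while pos < len(stream):
        flags = stream[pos]          # one control byte governs the next 8 tokens
        pos += 1
        for bit in range(8):
            if pos >= len(stream):
                return               # stream ended partway through a group
            if flags & (1 << bit):
                yield ('lit', stream[pos])
                pos += 1
            else:
                if pos + 1 >= len(stream):
                    return           # truncated pair, stop rather than guess
                yield ('pair', stream[pos], stream[pos + 1])
                pos += 2
```

Dumping the tokens like this, before trying to resolve any pairs, makes it easy to see whether the literal/pair split lines up with the legible parts of a file.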

Addendum II:

I wrote out a routine to decompress the LZSS stuff as described above. It works for files encoded using Allegro, but only works partially for Agartha, so this seems to confirm that Agartha uses a variant. One thing I noticed is that I have to flip the byte order of the base/length pairs to get the proper number of bytes to decode, but I haven't quite managed to get exactly the proper bytes out of it. Single-byte literals decompress perfectly, so there's no question about the overall scheme being similar to Allegro. It's just a matter of how to interpret the base index...

Routine for decompression:
C:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

#define SIZE 4096
#define EXTRA 18
#define MASK (SIZE - 1)

int main (int argc, char **argv)
{
    if (argc < 2)
    {
        return -1;
    }
    /*Load the file into memory*/
    FILE *fp = fopen (argv[1], "rb");
    if (NULL == fp)
    {
        return -1;
    }
    size_t size = 0;
    fseek (fp, 0, SEEK_END);
    size = ftell (fp);
    fseek (fp, 0, SEEK_SET);
  
    char *data = malloc (size);
    fread (data, 1, size, fp);
    fclose (fp);
  
    /*Decompress the contents, writing to stdout*/
    char ring[SIZE + EXTRA - 1];
    unsigned int r = SIZE - EXTRA;
    unsigned int ctl = 0;
    size_t pos = 0;
    int decoded = 0;
  
    memset (ring, 0, sizeof (ring));
    while (pos < size)
    {
        if (0 == (ctl&256))
        {
            if (pos >= size)
            {
                break;
            }
            ctl = data[pos++]&0xff;
            ctl |= 0xff00;
        }
        if (ctl&1)
        {
            putc (data[pos], stdout);
            ring[r&MASK] = data[pos];
            decoded++;
            pos++;
            r++;
        }
        else
        {
            int b0 = data[pos + 0]&0xff;
            int b1 = data[pos + 1]&0xff;
            int base = b0|((b1&0xf0)<<4);
            int length = (b1&0x0f) + 3;
            for (int i = 0; i < length; i++)
            {
                int c = ring[(base + i)&MASK];
                putc (c, stdout);
                ring[r&MASK] = c;
                decoded++;
                r++;
            }
            pos += 2;
        }
        ctl >>= 1;
    }
  
    printf ("\n\ndecoded: %i\n", decoded);
    return 0;
}

If anyone wants to play along feel free! The more help the merrier.
 

FamilyGuy

I know you seem to have handled it in your previous code snippets, but could it be an endianness matter?
It's been a while since I've messed around with DC stuff, but I think SH4 is bi-endian and the data on disc is often read in big endian mode.

E.G. when modifying executables you often need to flip bytes around. And only the big endian TOC is considered by the gdrom drive.


C:
#include <byteswap.h>
is probably your friend if you want to test this.
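In Python the same experiment needs no extra library. A rough sketch, with made-up helper names, that reads a two-byte pair and a four-byte header field in both byte orders so the results can be compared by eye:

```python
from struct import unpack

def pair_as_words(b0, b1):
    """Combine two bytes into a 16-bit word, little-endian first, then big-endian."""
    little = b0 | (b1 << 8)
    big = (b0 << 8) | b1
    return little, big

def uint32_both_orders(raw):
    """Interpret the first four bytes as a uint32 in little- and big-endian order."""
    little, = unpack('<I', raw[:4])
    big, = unpack('>I', raw[:4])
    return little, big
```

Whichever interpretation gives plausible sizes and offsets for a known file is likely the right one.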

Could you post a test file for me to mess around with a little?
 

Sifting

Could you post a test file for me to mess around with a little?

To the best of my knowledge this is just a text file containing a list of music tracks, the same ones sitting on the disc root. One track per line, Windows-style line endings. I use this one to test since it's easy to tell when you get something right or wrong.

Unfortunately it seems to be more than just an endian issue, but it's too early to rule it out completely.

As an aside, I want to point out how well suited this algorithm is to the Dreamcast. For those unaware, the GD-ROM drive is quite slow and painful, and IIRC it's connected to the main CPU over the dainty G1 bus, so bandwidth is at a premium. More than that, the DC's main CPU has a special feature that lets the programmer turn half the cache into super fast general-purpose memory, which is perfect for this algorithm: it only needs a ring buffer and a few variables to keep track of its state. The Agartha team knew what was up!
 

Sifting

Code:
Ress\Music\Demo2001\bodnath_extr08.wav
Ress\Music\Demo2001\bodnath_extr02.wav
Ress\Music\Demo2001\kiangjing_extr08.wav
Ress\Music\Demo2001\gokyo_extr04.wav
Ress\Music\Demo2001\bodnath_extr11.wav
Ress\Music\Demo2001\bodnath_extr12.wav
Ress\Music\Demo2001\bodnath_extr10.wav
Ress\Music\Demo2001\bodnath_extr05.wav
Ress\Music\Demo2001\bodnath_extr06.wav
Ress\Music\Demo2001\bodnath_extr04.wav
Ress\Music\Demo2001\bodnath_extr13.wav
Ress\Music\Demo2001\bodnath_extr14.wav
Ress\Music\Demo2001\bodnath_extr09.wav
Ress\Music\Demo2001\bodnath_extr07.wav
Ress\Music\Demo2001\bodnath.wav
Ress\Music\Demo2001\bodnath_extr01.wav
Ress\Music\Demo2001\bodnath_extr03.wav
Ress\Music\Demo2001\gokyo_extr07.wav
Ress\Music\Demo2001\gokyo_extr06.wav
Ress\Music\Demo2001\gokyo_extr05.wav
Ress\Music\Demo2001\gokyo_extr08.wav
Ress\Music\Demo2001\gokyo.wav
Ress\Music\Demo2001\gokyo_extr01.wav
Ress\Music\Demo2001\gokyo_extr02.wav
Ress\Music\Demo2001\gokyo_extr03.wav
Ress\Music\Demo2001\Gosainkund_mod_niv_short.wav
Ress\Music\Demo2001\kalapathar.wav
Ress\Music\Demo2001\kaligandaki.wav
Ress\Music\Demo2001\kalapathar_extr05.wav
Ress\Music\Demo2001\kanchenjunga_extr11.wav
Ress\Music\Demo2001\kanchenjunga_extr15.wav
Ress\Music\Demo2001\kanchenjunga_extr05.wav
Ress\Music\Demo2001\kanchenjunga_extr09.wav
Ress\Music\Demo2001\kanchenjunga_extr12.wav
Ress\Music\Demo2001\kanchenjunga_extr16.wav
Ress\Music\Demo2001\kanchenjunga_extr06.wav
Ress\Music\Demo2001\kanchenjunga_extr13.wav
Ress\Music\Demo2001\kanchenjunga_extr07.wav
Ress\Music\Demo2001\kanchenjunga_eq.wav
Ress\Music\Demo2001\kanchenjunga_extr10.wav
Ress\Music\Demo2001\kanchenjunga_extr14.wav
Ress\Music\Demo2001\kanchenjunga_extr08.wav
Ress\Music\Demo2001\kaligandaki_extr12.wav
Ress\Music\Demo2001\kaligandaki_extr13.wav
Ress\Music\Demo2001\kaligandaki_extr10.wav
Ress\Music\Demo2001\kaligandaki_extr11.wav
Ress\Music\Demo2001\kaligandaki_extr06.wav
Ress\Music\Demo2001\kaligandaki_extr07.wav
Ress\Music\Demo2001\kaligandaki_extr05.wav
Ress\Music\Demo2001\kaligandaki_extr14.wav
Ress\Music\Demo2001\kaligandaki_extr15.wav
Ress\Music\Demo2001\kaligandaki_extr08.wav
Ress\Music\Demo2001\kaligandaki_extr09.wav
Ress\Music\Demo2001\kalapathar_extr01.wav
Ress\Music\Demo2001\kalapathar_extr02.wav
Ress\Music\Demo2001\kalapathar_extr03.wav
Ress\Music\Demo2001\kalapathar_extr04.wav
Ress\Music\Demo2001\kaligandaki_extr01.wav
Ress\Music\Demo2001\kaligandaki_extr02.wav
Ress\Music\Demo2001\kaligandaki_extr03.wav
Ress\Music\Demo2001\kaligandaki_extr04.wav
Ress\Music\Demo2001\kanchenjunga_extr01.wav
Ress\Music\Demo2001\kanchenjunga_extr02.wav
Ress\Music\Demo2001\kanchenjunga_extr03.wav
Ress\Music\Demo2001\kanchenjunga_extr04.wav
Ress\Music\Demo2001\kiangjing_extr10.wav
Ress\Music\Demo2001\kiangjing.wav
Ress\Music\Demo2001\kiangjing_extr04.wav
Ress\Music\Demo2001\kiangjing_extr11.wav
Ress\Music\Demo2001\kiangjing_extr01.wav
Ress\Music\Demo2001\kiangjing_extr02.wav
Ress\Music\Demo2001\kiangjing_extr03.wav
Ress\Music\Demo2001\kiangjing_extr07.wav
Ress\Music\Demo2001\kiangjing_extr05.wav
Ress\Music\Demo2001\kiangjing_extr06.wav
Ress\Music\Demo2001\kiangjing_extr09.wav
Ress\Music\Demo2001\ladakh.wav
Ress\Music\Demo2001\ladakh_extr03.wav
Ress\Music\Demo2001\ladakh_extr01.wav
Ress\Music\Demo2001\ladakh_extr02.wav
Ress\Music\Demo2001\Patshupatinath_modif_niv.wav
Ress\Music\Demo2001\Pe03_22_extr01.wav
Ress\Music\Demo2001\loupiotV3.wav

😤 I figured it out.

Here's the trick:
The pairs are 16 bits wide, written little endian; the lower 4 bits are the length - 3, and the remaining 12 are the unsigned base. When you decode the pair, subtract base + 1 from the current ring buffer position and cache the result. This is your offset; now just read out length bytes from the ring buffer starting at that offset, writing each one back into the ring buffer as you go. See code below.

C:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

#define SIZE 4096
#define MASK (SIZE - 1)

int main (int argc, char **argv)
{
    if (argc < 2)
    {
        return -1;
    }
    /*Load the file into memory*/
    FILE *fp = fopen (argv[1], "rb");
    if (NULL == fp)
    {
        return -1;
    }
    size_t size = 0;
    fseek (fp, 0, SEEK_END);
    size = ftell (fp);
    fseek (fp, 0, SEEK_SET);
    
    char *data = malloc (size);
    if (NULL == data)
    {
        fclose (fp);
        return -1;
    }
    fread (data, 1, size, fp);
    fclose (fp);
    
    /*Decompress the contents, writing to stdout*/
    char ring[SIZE];
    unsigned int r = 0;
    unsigned int ctl = 0;
    size_t pos = 0;
    int decoded = 0;
    
    memset (ring, 0, sizeof (ring));
    while (pos < size)
    {
        if (0 == (ctl&256))
        {
            if (pos >= size)
            {
                break;
            }
            ctl = data[pos++]&0xff;
            ctl |= 0xff00;
        }
        if (ctl&1)
        {
            putc (data[pos], stdout);
            ring[r&MASK] = data[pos];
            decoded++;
            pos++;
            r++;
        }
        else
        {
            int b0 = data[pos + 0]&0xff;
            int b1 = data[pos + 1]&0xff;
            int word = (b1<<8)|b0;
            int base = (word>>4)&0xfff;
            int length = (word&0x0f) + 3;
            int offset = r - (base + 1);
            for (int i = 0; i < length; i++)
            {
                int c = ring[(offset + i)&MASK];
                putc (c, stdout);
                ring[r&MASK] = c;
                decoded++;
                r++;
            }
            pos += 2;
        }
        ctl >>= 1;
    }
    
    printf ("\n\ndecoded: %i\n", decoded);
    return 0;
}
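The pair decoding on its own can be written out in Python as a worked example. This is a sketch of the rule described above, with `decode_pair` and its parameters as illustrative names; `min_length` defaults to the 3 that the routine above adds to the length field.

```python
RING_SIZE = 4096

def decode_pair(b0, b1, r, min_length=3):
    """Turn a 2-byte pair into (ring offset, copy length) for ring position r."""
    word = b0 | (b1 << 8)                 # pairs are stored little endian
    length = (word & 0x0F) + min_length   # low 4 bits: length minus the minimum
    base = (word >> 4) & 0xFFF            # high 12 bits: unsigned distance minus 1
    offset = (r - (base + 1)) % RING_SIZE # wrap around the ring buffer
    return offset, length
```

For the bytes 0x20 0x00 read when the ring position is 10, this gives base 2, length 3 and an offset of 7, so the three bytes written at positions 7, 8 and 9 are repeated.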

Now to verify if the other 'mode' is really a different algorithm or something. We're going to bust this thing wide open.
 

Sifting

I ran some metrics on the files, and about 60% are encoded in LZSS, 30% in some other encoding - probably LZW - and 10% are uncompressed. Here are a few bits that I've found so far. I've employed the PVR routines that I wrote for Castlevania again, so we can see the textures...
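Tallying the modes is a short loop over the entry headers. A sketch, assuming the layout from the template in the first post (an offset table followed by entries that each start with a 10-byte `<IIH` header); `tally_modes` and `as_percentages` are hypothetical helpers:

```python
from collections import Counter
from struct import unpack_from

HEADER = '<IIH'  # uncompressed size, compressed size, mode flags

def tally_modes(pak_bytes, count):
    """Count how many of the `count` entries use each mode value."""
    offsets = unpack_from(f'<{count}I', pak_bytes, 0)
    modes = Counter()
    for off in offsets:
        _uncompressed, _compressed, mode = unpack_from(HEADER, pak_bytes, off)
        modes[mode] += 1
    return modes

def as_percentages(modes):
    """Convert raw counts into percentages of the total number of entries."""
    total = sum(modes.values())
    if not total:
        return {}
    return {mode: 100.0 * n / total for mode, n in modes.items()}
```

Run against the real archive this would be `as_percentages(tally_modes(pak, 550))`, with `pak` holding the contents of AGARTHA.PAK.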

A lot of Agartha appears to be scriptable in plain text, very similar to how Castlevania was. Again, this is probably because DeLoura's book was the hot read back then!

Code:
/*****************************************************************************
     ______________________________________________________________________
     )                                                                       )
    /                        Agartha - Demo2001                              /
   /                            Playable.def                             /
  (_____________________________________________________________________(
  |              )                                                            |
  |    Purpose  /    Palyable demo. Player can walk through Agartha world    |
  |            (    as well as meet some characters...     
  |==========\==========================================================|
  |              )                                                            |
  | Author   /    Sébastien Viannay - NoCliché                            |
  |=========(===========================================================|
  |             \                                                            |
  | History   )    01/03/2001 : Creation.                                    |
  |__________/__________________________________________________________|

*****************************************************************************/

// include the entry point scene for the playabale demo part
INCLUDE    "Village\Village.def"

These book textures seem super interesting, but they're all written in French... @Sega Dreamcast Info care to translate for us? The others are attached to the post.

book_pages_00.pvr.png


Whose face is this?
phillipeface.pvr.png

There appear to be quite a few unfinished textures.


serveuse.pvr.png


fossoyeurmalade.pvr.png



chasseur.pvr.png

Loading screens...


loading1.pvr.png


loading3.pvr.png

loading8.pvr.png

More to come, hopefully. I start my work week today, so progress will be slower, unfortunately.
 

Attachments

  • book_pages_00.pvr.png
    book_pages_00.pvr.png
    111.8 KB · Views: 0
  • book_pages_01.pvr.png
    book_pages_01.pvr.png
    114.1 KB · Views: 0
  • book_pages_02.pvr.png
    book_pages_02.pvr.png
    110 KB · Views: 0
  • book_pages_03.pvr.png
    book_pages_03.pvr.png
    118.7 KB · Views: 0
  • book_pages_04.pvr.png
    book_pages_04.pvr.png
    106.9 KB · Views: 0
  • phillipeface.pvr.png
    phillipeface.pvr.png
    31.4 KB · Views: 0
  • serveuse.pvr.png
    serveuse.pvr.png
    103.2 KB · Views: 0
  • chef_i.pvr.png
    chef_i.pvr.png
    108.1 KB · Views: 0

FamilyGuy

Nice job!

French is my native language, but I don't really have the free time to translate it properly right now. It's some Lovecraftian-inspired lore about Cyclopean pre-Flood beings and the cities they built after the Flood; "Leng" of Lovecraftian fame is named. There's also some mention of the hollow Earth, and Agartha seems to be the name of the unfinished city at the core of the earth.

It seems to be the base of the Lore for the game. Nothing too original if you know Lovecraft, but I think it would've been original for a videogame at the time.
 

Sifting

Okay, I figured out the alternate encoding. It was just LZSS again, but with a one line change:

C:
int length = (word&0x0f) + (mode + 1);

The mode comes from the 2 byte flags prefix in the file header. See the (revised) template in the first post.
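Putting it together, here is a Python sketch of a decoder that takes the mode into account. It assumes mode 0 means the data is stored as-is and that any other mode is LZSS with a minimum match length of mode + 1; `lzss_decode` is an illustrative name, and the output is not trimmed to the uncompressed size from the header.

```python
RING_SIZE = 4096
RING_MASK = RING_SIZE - 1

def lzss_decode(data, mode):
    """Decode one PAK entry payload according to its header mode."""
    if mode == 0:
        return bytes(data)               # assumed: mode 0 is stored uncompressed
    min_length = mode + 1
    ring = bytearray(RING_SIZE)
    out = bytearray()
    r = 0                                # running ring buffer write position
    pos = 0
    while pos < len(data):
        flags = data[pos]
        pos += 1
        for bit in range(8):
            if pos >= len(data):
                break
            if flags & (1 << bit):       # set bit: copy one literal byte
                byte = data[pos]
                pos += 1
                out.append(byte)
                ring[r & RING_MASK] = byte
                r += 1
            else:                        # clear bit: 2-byte base/length pair
                if pos + 1 >= len(data):
                    pos = len(data)      # truncated pair, stop decoding
                    break
                word = data[pos] | (data[pos + 1] << 8)
                pos += 2
                length = (word & 0x0F) + min_length
                offset = r - ((word >> 4) + 1)
                for i in range(length):
                    byte = ring[(offset + i) & RING_MASK]
                    out.append(byte)
                    ring[r & RING_MASK] = byte
                    r += 1
    return bytes(out)
```

Because each copied byte is written back into the ring before the next one is read, a pair may overlap the bytes it is producing, which is how short runs of a repeated byte get encoded.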

So now we've cracked open the treasure chest of Agartha... Let's start examining these models and animations next, shall we?
 

dark

Registered
Registered
Joined
Jun 19, 2019
Messages
64
Reaction score
43
Points
18
AG User Name
dark
AG Join Date
September 2, 2011
Cool! You might consider exploring Geist Force as well. I don't remember anyone really diving into the disc contents on that title other than to re-order them for smoother loading.
 

Sifting

Cool! You might consider exploring Geist Force as well. I don't remember anyone really diving into the disc contents on that title other than to re-order them for smoother loading.
Really? I would have thought someone had pulled it apart given how high profile that release was. I really like Geist Force, too. No promises though!
 

dark

Maybe some people did for their own benefit, but I don't remember anyone at assembler or elsewhere discussing any interesting findings.
I recall someone indicated you could rename the level files on the disc in order to load different levels in "nimai's room" - the mode where you could freely fly around off the fixed rail. I think that allows you to explore a bit more of certain levels, especially the snow level where the fixed rail crashes you into the ground right at the very beginning, though I didn't explore that myself.
 

Wombat

Donator
Donator
Registered
Joined
May 31, 2019
Messages
108
Reaction score
109
Points
43
AG User Name
Wombat
AG Join Date
14-03-2004
@Sifting all this time I have been keeping an eye on this conversation, completely in awe of what you are accomplishing and discovering. Just leaving this here to say thank you for these awesome reports on both Agartha and Castlevania. Keep up the good work, very excited to see what else you will discover.

Also +1 for dark's suggestion; Geist Force for sure is a game which holds many secrets. For one, there are indeed multiple playable maps on the disc which are not played by default. Seeing how deep you go, I think having the source materials untouched is probably the best course of action for you. Geist Force unfortunately never got shared untouched, BUT... luckily I kept the files around from when a select few were trying to figure out how to get the game running on a stock Dreamcast.

Basically this is the whole build with dummied data, but the most important files, ip.bin and 1ST_READ.BIN, can be found in it fully intact. So with these plus the build that is floating around you can create an "untouched" image: https://we.tl/t-FR8Z3R2rXh (download valid for 7 days)
 