
Title: C# DBPF unpacking
Post by: Afro on 2009 June 28, 21:28:09

Does anyone have any idea how to implement DBPF decompression in C#?
I've been trying to convert the PHP version, but it just... keeps giving me a lot of errors.

Code:

        /// <summary>
        /// Extracts and uncompresses, when necessary, all the loaded entries in an archive
        /// to the specified path. Assumes that LoadArchive() has been called first.
        /// </summary>
        /// <param name="Path">The path to extract to.</param>
        public void ExtractFiles(string Path)
        {
            BinaryReader Reader = new BinaryReader(File.Open(m_ArchiveName, FileMode.Open));

            foreach (DBPFEntry Entry in m_Entries)
            {
                Reader.BaseStream.Seek(Entry.FileOffset, SeekOrigin.Begin);

                if (Entry.Compressed)
                {
                    Entry.FileData = new byte[Entry.DecompressedSize];
                    Decompress(Reader, Entry);
                }
            }
        }

        private void Decompress(BinaryReader Reader, DBPFEntry Entry)
        {
            uint CompressedSize = Reader.ReadUInt32();
            ushort CompressionID = Reader.ReadUInt16();
            Reader.ReadBytes(3); //Uncompressed size of file...

            int NumPlainChars = 0;
            int NumCopy = 0;
            int Offset = 0;

            string Answer = "";

            if (CompressionID == 0xFB10 || CompressionID == 0xFB50)
            {
                uint Length = Entry.FileSize;

                for (; Length > 0;)
                {
                    //Control character
                    byte CC = Reader.ReadByte();
                    Length -= 1;

                    if (CC >= 252) //0xFC
                    {
                        NumPlainChars = CC & 0x03;
                        if (NumPlainChars > Length)
                            NumPlainChars = (int)Length;

                        NumCopy = 0;
                        Offset = 0;
                    }
                    else if (CC >= 224) //0xE0
                    {
                        NumPlainChars = (CC - 0xDF) << 2;
                        NumCopy = 0;
                        Offset = 0;
                    }
                    else if (CC >= 192) //0xC0
                    {
                        Length -= 3;

                        char Byte1 = Reader.ReadChar();
                        char Byte2 = Reader.ReadChar();
                        char Byte3 = Reader.ReadChar();

                        NumPlainChars = CC & 0x03;
                        NumCopy = ((CC & 0x0C) << 6) + 5 + Byte3;
                        Offset = ((CC & 0x10) << 12 ) + (Byte1 << 8) + Byte2;
                    }
                    else if (CC >= 128) //0x80
                    {
                        Length -= 2;

                        char Byte1 = Reader.ReadChar();
                        char Byte2 = Reader.ReadChar();

                        NumPlainChars = (Byte1 & 0xC0) >> 6;
                        NumCopy = (CC & 0x3F) + 4;
                        Offset = ((Byte1 & 0x3F) << 8) + Byte2;
                    }
                    else
                    {
                        Length -= 1;

                        char Byte1 = Reader.ReadChar();

                        NumPlainChars = (CC & 0x03);
                        NumCopy = ((CC & 0x1C) >> 2) + 3;
                        Offset = ((CC & 0x60) << 3) + Byte1;
                    }

                    if (NumPlainChars > 0)
                    {
                        string Tmp = new string(Reader.ReadChars(NumPlainChars));
                        Answer += Tmp;
                    }

                    int FromOffset = Answer.Length - (Offset + 1);

                    for (int i = 0; i < NumCopy; i++)
                    {
                        if(((FromOffset + i) < Answer.Length) && ((FromOffset + i) > 0))
                            Answer = Answer.Substring(FromOffset + i, 1);
                    }
                }
            }
            else
                return;
        }

That's my current implementation, up there.
Even with these checks:

if(((FromOffset + i) < Answer.Length) && ((FromOffset + i) > 0))

It still fails on this line:

string Tmp = new string(Reader.ReadChars(NumPlainChars));

Saying: 'The output char buffer is too small to contain the decoded characters, encoding 'Unicode (UTF-8)' fallback 'System.Text.DecoderReplacementFallback'.'

Sorry if this is posted in the wrong forum, but I don't seem to have permission to post in the 'Bowels' forums.
I've peeked at the SimPE source, but there are too many classes and it is generally a big mess.
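
The exception quoted above is what BinaryReader.ReadChars() tends to produce on binary input: ReadChar() and ReadChars() decode the bytes as text (UTF-8 by default), and compressed data is not valid text. A minimal sketch of the alternative, reading raw bytes with ReadBytes(), is below. The class name RawReadDemo and the helper CopyLiteralBytes are hypothetical names for illustration, not part of the original code.

```csharp
using System;
using System.IO;

public static class RawReadDemo
{
    // Hypothetical helper: copies 'count' literal bytes from the compressed
    // input into the output buffer without any text decoding.
    public static int CopyLiteralBytes(BinaryReader reader, byte[] output, int outputPos, int count)
    {
        byte[] plain = reader.ReadBytes(count); // raw bytes, no encoding involved
        Array.Copy(plain, 0, output, outputPos, plain.Length);
        return outputPos + plain.Length;
    }

    public static void Main()
    {
        // These values are not valid UTF-8 text, but are perfectly fine as bytes.
        byte[] input = { 0xFB, 0x10, 0x80, 0xFF };
        byte[] output = new byte[input.Length];

        using (var reader = new BinaryReader(new MemoryStream(input)))
        {
            int written = CopyLiteralBytes(reader, output, 0, input.Length);
            Console.WriteLine(written);                       // 4
            Console.WriteLine(BitConverter.ToString(output)); // FB-10-80-FF
        }
    }
}
```

The same reasoning applies to the Byte1/Byte2/Byte3 values in the listing: reading them with ReadByte() rather than ReadChar() keeps them as plain numbers.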

Title: Re: C# DBPF unpacking
Post by: J. M. Pescado on 2009 June 29, 02:57:11

You could try implementing it based on the Ricktool C# ones at http://gib.me, but the rest of the community seems to be using somewhat more crapulent algorithms than the one we have in the original TS2 dbpfrc.

Title: Re: C# DBPF unpacking
Post by: Afro on 2009 June 29, 06:14:52

Thanks!
But... exactly what tool are you referring to?
There seems to be a lot of source code in his SVN, but none that's Sims related.

Edit:

I tried using zlib's deflate() decompression, by way of SharpZipLib, but I keep getting 'Unknown block type' errors, even when I skip the 9-byte header. :\

Code:

        private byte[] Deflate(DBPFEntry Entry)
        {
            Stream S = new InflaterInputStream(new MemoryStream(Entry.FileData), new ICSharpCode.SharpZipLib.Zip.Compression.Inflater(true));
            MemoryStream MemStream = new MemoryStream();

            int SizeRead = 0;
            byte[] Buffer = new byte[2048];

            while (true)
            {
                SizeRead = S.Read(Buffer, 0, Buffer.Length);
                if (SizeRead > 0)
                    MemStream.Write(Buffer, 0, SizeRead); // write only the bytes actually read
                else
                    break;
            }

            return MemStream.ToArray();
        }
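
An 'Unknown block type' error is what one would expect if the payload is not a deflate stream at all. The header read in the first listing (a 4-byte size, a 2-byte compression ID compared against 0xFB10 or 0xFB50, and a 3-byte uncompressed size) suggests a QFS/RefPack-style scheme rather than zlib. A minimal sketch of checking for that ID before choosing a decompressor follows. RefPackSignature and LooksLikeRefPack are hypothetical names, and the ID values are simply the ones used in the listing above.

```csharp
using System;

public static class RefPackSignature
{
    // Returns true when the buffer starts with a 9-byte header whose
    // compression ID (bytes 4-5, read as a little-endian UInt16) is one of
    // the two values checked in the original Decompress() listing.
    public static bool LooksLikeRefPack(byte[] data)
    {
        if (data == null || data.Length < 9)
            return false;

        ushort id = (ushort)(data[4] | (data[5] << 8));
        return id == 0xFB10 || id == 0xFB50;
    }

    public static void Main()
    {
        byte[] refPack = { 0x0C, 0, 0, 0, 0x10, 0xFB, 0, 0, 0x03, 0, 0, 0 };
        byte[] other = { 0x78, 0x9C, 0, 0, 0, 0, 0, 0, 0 };

        Console.WriteLine(LooksLikeRefPack(refPack)); // True
        Console.WriteLine(LooksLikeRefPack(other));   // False
    }
}
```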

Title: Re: C# DBPF unpacking
Post by: J. M. Pescado on 2009 June 30, 13:02:34

http://www.moreawesomethanyou.com/smf/index.php/topic,8279.0.html

This tool here. It is for TS2 and requires modification to make it work for TS3, but the compression is sound and is what is used in s3rc.

Title: Re: C# DBPF unpacking
Post by: Afro on 2009 June 30, 13:18:44

That's a C++ tool and the code is generally (no offense to whoever wrote it) kinda cryptic.

Anyways... I don't know if anyone here is very good with C#, but I changed my code kinda drastically:

Code:

        /// <summary>
        /// Extracts and uncompresses, when necessary, all the loaded entries in an archive
        /// to the specified path. Assumes that LoadArchive() has been called first.
        /// </summary>
        /// <param name="Path">The path to extract to.</param>
        public void ExtractFiles(string Path)
        {
            BinaryReader Reader = new BinaryReader(File.Open(m_ArchiveName, FileMode.Open));

            for(int i = 0; i < m_Entries.Count; i++)
            {
                Reader.BaseStream.Seek(m_Entries[i].FileOffset, SeekOrigin.Begin);

                if (m_Entries[i].Compressed)
                {
                    m_Entries[i].FileData = new byte[m_Entries[i].DecompressedSize];

                    Reader.BaseStream.Seek(m_Entries[i].FileOffset, SeekOrigin.Begin);
                    m_Entries[i].FileData = Reader.ReadBytes((int)m_Entries[i].FileSize);

                    m_Entries[i] = Decompress(new BinaryReader(new MemoryStream(m_Entries[i].FileData)), 
                        m_Entries[i]);

                    if (m_Entries[i] != null)
                    {
                        BinaryWriter Writer = new BinaryWriter(File.Create(Path + m_Entries[i].InstanceID.ToString() + "." + m_Entries[i].GetFileExtension()));
                        Writer.Write(m_Entries[i].FileData);
                        Writer.Close();
                    }
                }
            }
        }

        private DBPFEntry Decompress(BinaryReader Reader, DBPFEntry Entry)
        {
            uint CompressedSize = Reader.ReadUInt32();
            ushort CompressionID = Reader.ReadUInt16();
            Reader.ReadBytes(3); //Uncompressed size of file...

            int NumPlainChars = 0;
            int NumCopy = 0;
            int Offset = 0;

            MemoryStream Answer = new MemoryStream();
            BinaryWriter Writer = new BinaryWriter(Answer);

            if (CompressionID == 0xFB10 || CompressionID == 0xFB50)
            {
                bool Stop = false;

                while(!Stop)
                {
                    //Control character
                    byte CC = Reader.ReadByte();

                    if (CC >= 252) //0xFC
                    {
                        NumPlainChars = CC & 0x03;
                        if (NumPlainChars > Reader.BaseStream.Length)
                            NumPlainChars = (int)Reader.BaseStream.Length;

                        NumCopy = 0;
                        Offset = 0;

                        Stop = true;
                    }
                    else if (CC >= 224) //0xE0
                    {
                        NumPlainChars = (CC - 0xDF) << 2;
                        NumCopy = 0;
                        Offset = 0;
                    }
                    else if (CC >= 192) //0xC0
                    {
                        byte Byte1 = Reader.ReadByte();
                        byte Byte2 = Reader.ReadByte();
                        byte Byte3 = Reader.ReadByte();

                        NumPlainChars = CC & 0x03;
                        NumCopy = ((CC & 0x0C) << 6) + 5 + Byte3;
                        Offset = ((CC & 0x10) << 12 ) + (Byte1 << 8) + Byte2;
                    }
                    else if (CC >= 128) //0x80
                    {
                        byte Byte1 = Reader.ReadByte();
                        byte Byte2 = Reader.ReadByte();

                        NumPlainChars = (Byte1 & 0xC0) >> 6;
                        NumCopy = (CC & 0x3F) + 4;
                        Offset = ((Byte1 & 0x3F) << 8) + Byte2;
                    }
                    else
                    {
                        byte Byte1 = Reader.ReadByte();

                        NumPlainChars = (CC & 0x03);
                        NumCopy = ((CC & 0x1C) >> 2) + 3;
                        Offset = ((CC & 0x60) << 3) + Byte1;
                    }

                    if (NumPlainChars > 0)
                        Writer.Write(Reader.ReadBytes(NumPlainChars));

                    long FromOffset = Answer.Length - (Offset + 1);

                    for (int i = 0; i < NumCopy; i++)
                    {
                        //Answer += Answer.Substring(FromOffset + i, 1);
                        Writer.Write(BinarySubstring(Reader, FromOffset + i, 1));
                    }
                }

                Entry.FileData = Answer.ToArray();
                Writer.Close();
                Reader.Close();
            }
            else
                return null;

            return Entry;
        }

        private byte[] BinarySubstring(BinaryReader Reader, long Offset, int NumBytes)
        {
            long CurrentOffset = Reader.BaseStream.Position;
            Reader.BaseStream.Seek(Offset, SeekOrigin.End);
            byte[] Data = Reader.ReadBytes(NumBytes);
            Reader.BaseStream.Seek(CurrentOffset, SeekOrigin.Begin);

            return Data;
        }

I actually managed to inflate the data to 17 megs at one point with this code (the testing file is only 13 megs), but characters were copied after each other in long sequences (i.e. the inflation didn't work properly). The main problem with this code is that the Offset (the parameter for the BinarySubstring() method) consistently wants the reader to seek to a position before the beginning of the stream, even when I seek from the end of the stream!
Where am I supposed to be searching FROM (Beginning, Current position or End of the stream), and why is it trying to seek to a position BEFORE the beginning of the stream?
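
On the question of where to seek from: in LZ-style schemes like this one, a copy instruction refers back into the data that has already been decompressed, not into the compressed input. So the copy should read from the output buffer, counting Offset + 1 bytes back from the current end of the output, which is why seeking in Reader goes out of range. The sketch below reuses the control-character formulas from the listing above and copies byte by byte from the output array. It is an untested-against-real-archives sketch under these assumptions: the 3-byte uncompressed size is big-endian, the input is well formed (no bounds checks), and the names RefPackSketch and Decompress are hypothetical.

```csharp
using System;
using System.IO;
using System.Text;

public static class RefPackSketch
{
    // Input starts with the 9-byte header from the listing above:
    // 4-byte compressed size, 2-byte compression ID, 3-byte uncompressed size.
    public static byte[] Decompress(byte[] data)
    {
        using (var reader = new BinaryReader(new MemoryStream(data)))
        {
            reader.ReadUInt32();                       // compressed size (unused here)
            ushort id = reader.ReadUInt16();
            if (id != 0xFB10 && id != 0xFB50)
                return null;

            // Assumption: the 3-byte uncompressed size is stored big-endian.
            byte[] s = reader.ReadBytes(3);
            int size = (s[0] << 16) | (s[1] << 8) | s[2];

            byte[] output = new byte[size];
            int pos = 0;                               // current end of the output
            bool stop = false;

            while (!stop && pos < size)
            {
                int numPlain, numCopy = 0, offset = 0;
                byte cc = reader.ReadByte();           // control character

                if (cc >= 0xFC) { numPlain = cc & 0x03; stop = true; }
                else if (cc >= 0xE0) { numPlain = (cc - 0xDF) << 2; }
                else if (cc >= 0xC0)
                {
                    byte b1 = reader.ReadByte(), b2 = reader.ReadByte(), b3 = reader.ReadByte();
                    numPlain = cc & 0x03;
                    numCopy = ((cc & 0x0C) << 6) + 5 + b3;
                    offset = ((cc & 0x10) << 12) + (b1 << 8) + b2;
                }
                else if (cc >= 0x80)
                {
                    byte b1 = reader.ReadByte(), b2 = reader.ReadByte();
                    numPlain = (b1 & 0xC0) >> 6;
                    numCopy = (cc & 0x3F) + 4;
                    offset = ((b1 & 0x3F) << 8) + b2;
                }
                else
                {
                    byte b1 = reader.ReadByte();
                    numPlain = cc & 0x03;
                    numCopy = ((cc & 0x1C) >> 2) + 3;
                    offset = ((cc & 0x60) << 3) + b1;
                }

                // Literal bytes come straight from the compressed input.
                byte[] plain = reader.ReadBytes(numPlain);
                Array.Copy(plain, 0, output, pos, plain.Length);
                pos += plain.Length;

                // Copied bytes come from the OUTPUT, Offset + 1 bytes back.
                // Copying one byte at a time lets a copy overlap its own result.
                int from = pos - (offset + 1);
                for (int i = 0; i < numCopy; i++)
                    output[pos++] = output[from + i];
            }

            return output;
        }
    }

    public static void Main()
    {
        // Hand-built stream: literal "abc", then copy 6 bytes from 3 bytes back.
        byte[] packed = { 0x0F, 0, 0, 0, 0x10, 0xFB, 0, 0, 9, 0x0F, 0x02, 0x61, 0x62, 0x63, 0xFC };
        Console.WriteLine(Encoding.ASCII.GetString(Decompress(packed))); // abcabcabc
    }
}
```

Writing into a preallocated array instead of a MemoryStream keeps the back-reference a simple index calculation.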

Title: Re: C# DBPF unpacking
Post by: J. M. Pescado on 2009 June 30, 13:58:24

Quote from: Afro on 2009 June 30, 13:18:44

That's a C++ tool and the code is generally (no offense to whoever wrote it) kinda cryptic.

It's compression code. Of course it's cryptic. Compression code is basically utterly incomprehensible to anyone who doesn't have an extensive background in compression algorithms, and the best anyone else can really do is take it and use it as-is. In this case, it functions effectively as a black box of sorts with a clear input and output end. That's as much as you can ask for.

Quote from: Afro on 2009 June 30, 13:18:44

Anyways... I don't know if anyone here is very good with C#, but I changed my code kinda drastically:

Why do you have this fixation on that vile Microsoftian abomination, anyway? Use a real programming language!

Title: Re: C# DBPF unpacking
Post by: Afro on 2009 June 30, 14:52:52

Heh, after digging around more in the SimPE source, I found the C# Decompression function I had been longing for.
And no, it turns out that the problem is not, in fact, that I'm coding in C#, but that the DBPF format is extremely quirky and cumbersome. I probably overlooked some details when reading the specification(s), but anyways... here's what I found:

1. All files in the archive will be listed in the archive's DIR resource even if they are not compressed. At least so long as any file in the archive is compressed.
2. The uncompressed size listed for an entry in the archive's DIR resource consistently seems to be wrong, meaning:
3. ... to find out if a file is actually compressed, you have to read its compression header, get the uncompressed size from there, and then check if it's smaller than or the same as the [compressed] size. If it isn't, it means the file is compressed.
4. Consequently, every single file (except the DIR file, obviously) in a compressed archive will have a compression header even if the file is not compressed itself.

Note that all of the above only applies to archives that have compressed files in them, and I haven't tested my code on an uncompressed archive yet. The rules might be slightly or even totally different for said archives.
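
The check described in points 2 and 3 can be sketched roughly as follows: take the uncompressed size from the entry's own header and compare it with the size stored for the entry. This assumes the 9-byte header layout from the Decompress() listing and that the 3-byte size is big-endian. CompressionCheck and IsActuallyCompressed are hypothetical names, and the rule itself is only the observation from the list above, not a confirmed part of any specification.

```csharp
using System;

public static class CompressionCheck
{
    // entryData: the raw bytes of the entry as stored in the archive.
    // storedSize: the (possibly compressed) size recorded for the entry.
    public static bool IsActuallyCompressed(byte[] entryData, uint storedSize)
    {
        if (entryData == null || entryData.Length < 9)
            return false;

        // Bytes 6-8: uncompressed size (assumed big-endian).
        uint headerSize = (uint)((entryData[6] << 16) | (entryData[7] << 8) | entryData[8]);

        // Per point 3: if the header size is smaller than or equal to the
        // stored size, the entry is treated as not compressed.
        return headerSize > storedSize;
    }

    public static void Main()
    {
        byte[] grows = { 0x0F, 0, 0, 0, 0x10, 0xFB, 0x00, 0x01, 0x00 }; // header size 256
        byte[] same = { 0x0F, 0, 0, 0, 0x10, 0xFB, 0x00, 0x00, 0x0F };  // header size 15

        Console.WriteLine(IsActuallyCompressed(grows, 15)); // True
        Console.WriteLine(IsActuallyCompressed(same, 15));  // False
    }
}
```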

Title: Re: C# DBPF unpacking
Post by: J. M. Pescado on 2009 July 01, 02:53:38

The version in the SimPE source performs extremely badly, often failing to successfully compress many things, and it is not recommended that you use it.

Title: Re: C# DBPF unpacking
Post by: Afro on 2009 July 03, 15:48:21

Thanks for letting me know!
Right now being able to compress things is not my greatest concern, as I am simply trying to gain access to the data in The Sims Online's DBPF archives. How do I have those archives?
I ordered a used copy of The Sims Online off of eBay, because I had deleted EA-Land off my hard drive.
I am now so sick and tired of nobody having written a server emulator for this game that I am intending to see about recreating the client.

So far I have partial support for IFF files (still working on getting DGRP resources to display properly), full support for FAR archives and full support for DBPF archives (assuming TSO uses the same compression as SimCity 4, which... I hope it does).

And yes... I realize you need a server as well, but I do not think writing a custom server for TSO is going to be extremely hard, as it was never the most bandwidth intensive game. In fact, any game that can be run through HTTP with SSL encryption (which seems to be how most of the original protocol was implemented -- I did some packet sniffing while it was still online) is... not very bandwidth intensive at all.

Title: Re: C# DBPF unpacking
Post by: J. M. Pescado on 2009 July 03, 16:24:25

...The Sims Online is TS1-based. I'm not even sure they *USE* DBPF archives, and almost certainly not for this. Information on TS1 barely survives now, really. I recall people have had some success ripping the objects for use in TS1, but frankly at this juncture, it's not clear why it would be desirable to resurrect TS1...online. In any case, we're a TS2 and TS3 site; TS1 predates this site and we have no real information on or interest in it.

Quote from: Afro on 2009 July 03, 15:48:21

I am now so sick and tired of nobody having written a server emulator for this game that I am intending to see about recreating the client.

You mean the server? The client is just TS1.

Quote from: Afro on 2009 July 03, 15:48:21

And yes... I realize you need a server as well, but I do not think writing a custom server for TSO is going to be extremely hard, as it was never the most bandwidth intensive game. In fact, any game that can be run through HTTP with SSL encryption (which seems to be how most of the original protocol was implemented -- I did some packet sniffing while it was still online) is... not very bandwidth intensive at all.

Well, if you don't have the details on the original protocol spec AND you don't have an operational client, it seems like you are reinventing the wheel from scratch. What exactly do you hope to accomplish by this? It seems to me that you would be better off discarding hokey EAxian formats and rewriting a game from scratch. There's certainly nothing GOOD to emulate in TSO.

Title: Re: C# DBPF unpacking
Post by: Afro on 2009 July 04, 18:49:25

According to my research (which has been extensive), The Sims Online was the first game where the DBPF format was used to store (most of?) the game's data. I actually... am fairly sure I tried to extract the data while the game was still online, but I didn't get it to work.
Now that I have a decompression routine that works, all I have to do is modify it to work with SimCity 4 (which presumably contains the earliest version of the decompression algorithm available, which was a descendant of the one in The Sims Online), and it'll hopefully work. If not I'm going to have to open the archives in a hex viewer and see if I can get some more information.
And yes, the idea is to rewrite the client and server from scratch, but retaining compatibility with the old game data so it won't have to be remade. Considering TSO hasn't been online for about two years, EA aren't making any money from it and haven't for a long time, so I'm hoping they won't be offended if I re-release the client (which was, incidentally, freely available for download the last year or so the game was online under the name 'EA-Land').

Title: Re: C# DBPF unpacking
Post by: Inge on 2009 July 06, 08:11:25

Or have a look at http://www.simlogical.com/Sims3ToolsForum/index.php?board=6.0 which has compression and source code you can look at.

Title: Re: C# DBPF unpacking
Post by: Afro on 2009 July 12, 14:05:07

Thanks guys!
Finally got my copy of TSO in the mail, and... DBPF isn't the largest problem. In fact, it hasn't been a problem at all. TSO's DBPF archives aren't compressed. They contain directories that I haven't been able to figure out, but I'm still able to extract the files.
No, the biggest challenge as far as TSO's data is concerned is the new FAR (File ARchive) format, originally used by The Sims (1). It seems most of TSO's data is stored in those archives, and I haven't been able to figure 'em out yet. They don't seem to be substantially different from the original FAR archives, except that they seem to support compression (hopefully the same kind of RefPack compression used by SimCity 4's DBPF archives). Here is my preliminary writeup:

Code:

Version 3

The Sims Online (TSO) introduces a new version of the FAR format. This format is FAR, version 3. This format has not been completely reversed yet, so most of the details below are not set in stone.

Header

    * Signature - An eight byte string, consisting of the characters 'FAR!byAZ' (without quotes).
    * Version - A byte signifying the version of the archive. Should be 3.
    * Unknown - Three bytes of 0.
    * Manifest offset - A 4 byte integer specifying the offset to the manifest of the archive (from the beginning of the file), where offsets to every entry are kept.

Manifest

    * Number of files - A 4 byte integer specifying the number of files in the archive.
    * Manifest Entries - As many manifest entries as stated by the previous integer.

Manifest entry

    * Raw Size - The uncompressed size of the filedata, stored as a UInt32 (4 bytes).
    * Compressed Size - The compressed size of the file. FAR V. 3 seems to support compression. Will be the same as the first field if the file is not compressed. This seems to be a UInt16 (2 bytes).
    * Offset - The offset of the filedata in the file. Could possibly be a UInt32, but seems to be a UInt16 (2 bytes).
    * Unknown - 17 bytes of an unknown purpose.
    * Filename - A string representing the filename of the file(data). Seems to be null-terminated.

If anyone wants to help out in documenting this format, here (http://www.savefile.com/files/2151681) is the link to the uploaded archive that I'm currently working on. I also have a Wiki here (http://afr0games.com/simswiki/index.php?title=Far) that I use to document all of the original formats from the original Sims games (mostly TSO though). Feel free to use info from this Wiki as you like.
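
For anyone experimenting with the writeup, the header part could be read along these lines. This is only a sketch of the preliminary layout described above (8-byte signature, 1 version byte, 3 unknown bytes, 4-byte manifest offset). It assumes the manifest offset is little-endian, which the writeup does not state, and Far3HeaderSketch, Far3Header and ReadHeader are hypothetical names. The manifest entries are left out because their field sizes are still uncertain.

```csharp
using System;
using System.IO;
using System.Text;

public static class Far3HeaderSketch
{
    public sealed class Far3Header
    {
        public string Signature;
        public byte Version;
        public uint ManifestOffset;
    }

    // Reads the header fields in the order given in the writeup.
    public static Far3Header ReadHeader(Stream stream)
    {
        var reader = new BinaryReader(stream);
        var header = new Far3Header();

        header.Signature = Encoding.ASCII.GetString(reader.ReadBytes(8));
        if (header.Signature != "FAR!byAZ")
            throw new InvalidDataException("Not a FAR archive.");

        header.Version = reader.ReadByte();          // expected to be 3
        reader.ReadBytes(3);                         // unknown, three bytes of 0
        header.ManifestOffset = reader.ReadUInt32(); // assumption: little-endian
        return header;
    }

    public static void Main()
    {
        // Synthetic header for illustration only, not taken from a real archive.
        var bytes = new MemoryStream();
        bytes.Write(Encoding.ASCII.GetBytes("FAR!byAZ"), 0, 8);
        bytes.Write(new byte[] { 3, 0, 0, 0, 0x10, 0x20, 0, 0 }, 0, 8);
        bytes.Position = 0;

        Far3Header header = ReadHeader(bytes);
        Console.WriteLine(header.Version);        // 3
        Console.WriteLine(header.ManifestOffset); // 8208
    }
}
```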

Title: Re: C# DBPF unpacking
Post by: J. M. Pescado on 2009 July 12, 14:07:42

This really falls outside the scope of this board, as we simply do not deal in TS1/TSO materials here, but good luck with it.

Title: Re: C# DBPF unpacking
Post by: Afro on 2009 July 15, 22:28:10

Ok, another update:

I seem to have been able to extract the files now, but they seem to be compressed with a type of QFS/RefPack compression that has an 18-byte header. It should be the same type of compression used in SimCity 4. Does anyone know anything about this? Usually the header should be 9 bytes long, but decompressing the files seems completely impossible using normal QFS decompression, even when modified for SC4.
I've never heard of an 18-byte QFS header before. :(

Title: Re: C# DBPF unpacking
Post by: J. M. Pescado on 2009 July 16, 01:26:28

Not really, no. Like I said, this is a Sims 2 and 3 board. You should really find a Sims 1 board for this.
