The Way Forward

It is hard to believe the year is almost done and gone already. It has been a busy one for me, with some life drama earlier in the year and then a couple of larger projects keeping me on my toes since.

During the last few weeks I have taken the time to draft a solid road map for the projects and where I would like them to be by this time next year. The road map evolved in a very organic fashion, with me jotting down a few points here and there every day; it is now pretty long but very constructive. It is not formal enough to release in its rough, disorganized point-form state, but I will touch on a few items.

The biggest community-oriented change will be moving the projects' source into an svn/git management system, with the usual web interfaces to go with it and public bug/contribution tracking. To some this might seem like a long overdue task, but you need to understand that to this point the projects have been developed to cater first to my day-to-day administrative needs at work and only secondarily to the community. Although I will still put priority on features I require, it will become easier for others to submit contributions and assist with bug tracking, hopefully allowing for a more robust set of projects in the end.

The next big-ticket item on the road map is the integration of the projects into a suite utility, which is a long-standing desire of mine and others, but it requires a lot of work. I have decided it is best to completely rewrite many of the projects instead of trying to hack them into some half-baked suite. The age of APF and BFD is beginning to show; they no longer really compete with some other tools and in certain cases even drag behind in performance, features and usability. I intend to modernize the projects by rewriting them in cleaner (read: documented) code, with clear project targets that are better outlined from the start, along with features that will blow other Linux firewall wrappers and security suites out of the water. I do not want to get into the gritty details of specific plans yet, but things are definitely going to change in a good way.

On smaller items, there are some 40-odd specific entries I have put down on paper, ranging from feature additions, enhancements, rewrites and bug fixes to contrib additions, all of which need to be worked on either in conjunction with the suite project or before it can even get off the ground. Though these are tedious tasks, they must be done and will get done. I will detail these items further in a future post, but for now it is only important to keep in mind that the suite will not paper over the deficiencies of the smaller projects; everything will get some TLC.

I know this post is not that forthcoming with specifics, but stay tuned; I will get the road map cleaned up and posted soon. One of the new projects that will become part of the suite is a malware detection utility aimed primarily at web servers, which you can read more about here.

“oops” Wrong Server!

So this past weekend I did the unthinkable: I accidentally recycled the wrong dedicated server at work. Usually this is not much of an issue (not that I make a habit of it) thanks to the continuous data protection we have implemented at the data center (R1Soft CDP), except that the backup server this particular client system was using had suffered a catastrophic RAID failure the very night before. We have had RAID arrays go bust on us before; it is typically very rare, but it does happen. Obviously this resulted in the client's site and databases getting absolutely toasted, with only a static tar.gz cPanel backup available that was over a week old, and they were none too happy about the loss of the database content.

I have dealt with data loss of varying degrees in the past, but never in a case where a format had occurred WITH data being rewritten to the disk. We are also not talking about just a few hundred megs of rewritten data, but a complete OS reload along with a cPanel install, which comprises multiple gigabytes of data and countless software compilations, each with its own write-delete cycles against the disk.

So, the first thing I did on realizing the “incident” was stop everything on the server, remount all file systems read-only, and then have an “omg wtf” moment. Once I had collected myself I did the usual data loss chore of making a dd image of the disk to an NFS share over our gigabit private network while contemplating my next step. My last big data recovery task was some years ago, perhaps two or more, and since I am such a pack rat I still had a custom data recovery tarball on my workstation that contained a number of the tools I used back then. The main ones are testdisk and The Sleuth Kit (TSK); together these tools are invaluable.
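
For anyone who has not been through this drill before, the triage looks roughly like the sketch below; the device names, mount points and NFS path are illustrative, not the actual ones from this incident:

    # freeze writes: remount the mounted file systems read-only
    mount -o remount,ro /
    mount -o remount,ro /var

    # mount the NFS share that will hold the forensic image
    mount -t nfs 10.0.0.5:/recovery /mnt/recovery

    # image the whole disk, padding past read errors, so all further
    # work happens against the copy rather than the live disk
    dd if=/dev/sda of=/mnt/recovery/sda.img bs=4M conv=noerror,sync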

The testdisk tool is designed to recover partition data from a formatted disk, even one that has had minimal data rewrites, and it does this exceptionally well. In this case I went in a bit unsure of the success I would have, but sure enough, after some poking and prodding of testdisk options I was able to recover the partition table from the previous system installation. This was an important step, as any data that had not been overwritten instantly became available once the old partition scheme was restored; sadly, though, this did not provide the data I required, namely the client's databases. The restored partitions still gave me some metadata to work with and a relative range on the disk of where the data was located, instead of having to ferret through the whole disk. So with that, I created a new dd image with a more limited scope covering the /var partition, which effectively cut the amount of unallocated space I needed to search from 160GB down to 40GB.
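
Narrowing the scope once the partition table is back is straightforward; with the recovered layout visible again, the /var partition can be imaged on its own rather than re-reading the entire 160GB disk. A minimal sketch, with the partition device and sector values as placeholders:

    # confirm the start/end of the recovered /var partition
    fdisk -l /dev/sda

    # image only that partition rather than the full disk
    dd if=/dev/sda3 of=/mnt/recovery/var.img bs=4M conv=noerror,sync

    # or carve the same range out of the earlier full-disk image by sector
    # (skip/count values are placeholders taken from the fdisk output)
    dd if=/mnt/recovery/sda.img of=/mnt/recovery/var.img bs=512 skip=104857600 count=83886080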

It was now time to crack out the latest version of The Sleuth Kit and its companion Autopsy web application. I installed them into shared memory through /dev/shm and then went through the chore of remembering how to use the Autopsy web app. After a few minutes of poking around it started to come back to me, and before I knew it I was browsing my image files, which is a painfully tedious task done in the hope that the metadata can lead you to what you're looking for through file-name to inode relationships. That is really pretty pointless in the end, though, as ext3, as I understand it, zeros the metadata when deleting data, before completing the unlink process from the journal. I quickly scrapped anything to do with metadata and moved on to generating ASCII string dumps of the image's allocated and unallocated space, which allow for quick pattern-based searches to find data.
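
For reference, the same string dumps Autopsy builds can be produced straight from The Sleuth Kit's command line tools. The sketch below assumes the var.img image from the earlier dd step, and the output file names are just my own convention:

    # extract the unallocated blocks of the /var image into their own file
    blkls /mnt/recovery/var.img > /mnt/recovery/var.unalloc

    # ASCII string dumps with decimal byte offsets, one for the full image
    # and one for the unallocated extract
    srch_strings -a -t d /mnt/recovery/var.img > /mnt/recovery/var.str
    srch_strings -a -t d /mnt/recovery/var.unalloc > /mnt/recovery/var.unalloc.str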

The string dumps took a couple of hours to generate, after which I was able to keyword/regexp search the disk's contents with relative ease (do not try searching large images without the string dumps; it is absurdly slow). I began with string searches looking for the backup SQL dumps that had been taken less than 24 hours earlier during the weekend backups. Although I did eventually find these dumps, it turned out some of them were so large they spanned non-sequential parts of the disk. This made my job very difficult, as it then became a matter of trying to string together various chunks of an SQL dump for which I had no real knowledge of the underlying database structure. After many hours of effort and some hit-or-miss results, I managed to recover a smaller database for the client which in the end turned out to be absolutely useless to them. That was it for the night for me; I needed sleep.
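
The searches themselves are then just grep runs against those string dumps, with the leading decimal offset on each hit pointing back at where the data sits in the image. The patterns here are illustrative:

    # look for mysqldump headers and row data in unallocated space
    grep -i "MySQL dump" /mnt/recovery/var.unalloc.str
    grep "INSERT INTO" /mnt/recovery/var.unalloc.str

    # the number at the start of each hit is the byte offset into the
    # unallocated extract, which Autopsy (or blkcalc) can map back to a
    # block in the original image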

Sunday morning brought an individual from the client's organization who was familiar with the database structure of their custom web application and was able to give me the exact table names they needed recovered, which was exactly what I needed. I was then able to craft regexp queries that found all insert, update and structure definitions for each of the tables they required, and despite parts of these tables being spread across the disk, knowing what they needed allowed my queries to be accurate and give me the locations of all the data. From there it was just a matter of browsing the data, adjusting the fragment range I was viewing so that it included the beginning and end of each data element, and then exporting the data into Notepad where I reconstructed the SQL dumps to what turned out to be a very consistent state. This took a little while, but it was nowhere near as painful a process as my efforts from the night before, so I was very happy with where we ended up.
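
With known table names the queries get much more precise, and each hit can be pulled out as a block range instead of being browsed one fragment at a time. A rough sketch, using a hypothetical table name and placeholder offsets:

    # locate the structure definition and row data for one known table
    # ('orders' is a stand-in for the client's real table name)
    grep -E 'CREATE TABLE `orders`|INSERT INTO `orders`' /mnt/recovery/var.unalloc.str

    # convert a byte offset from the string dump into a 4k block number
    echo $((123456789 / 4096))    # -> 30140

    # dump a run of blocks around that location for manual reconstruction
    dd if=/mnt/recovery/var.unalloc of=/tmp/orders.frag bs=4096 skip=30140 count=256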

A couple of hours after I turned the data over to the client they were restoring the tables they needed to get back online, and this was followed by a ping from the client on AIM saying they had successfully restored all data and were back online in a state nearly identical to just before they went offline. What the client took from this is to never trust anyone else alone with safeguarding their data; they now intend to keep regular backups of their own in addition to the backups we retain, which is a very sensible practice to say the least.

New Site, At Last!

It has been on my plate for a long time now to redo the R-fx Networks site. The process began some years ago, with a few incarnations of new sites developing behind the scenes, but none ever made it into production. In the end I drew the conclusion that sometimes simpler is better, so here we have it: the new R-fx Networks site, devoted to the projects and my personal work as a whole.

Where I want to go with this new site is explained a bit in the about us section, so head on over there if you have not already read it.

To elaborate on some of the specifics, one of the larger tasks ahead is converting what documentation exists for the projects into an online format, moving beyond the older README files. This will allow for much easier management of the documentation on my part, along with search capabilities through the site to find whatever you may need to know about a particular project. The task is not without hurdles; the real issue is that most of the projects are severely lacking in consistent and complete documentation, so I have my work cut out for me.

While the documentation work is being done I will also be looking to fill the site with content relating to all things Linux, networking and more. For instance, an index of reference links to off-site articles relating to the projects will be created to assist new and old users alike, covering how others utilize our projects, customize them, or just plain old installation guides. You will also on occasion have to put up with my own personal blog posts, which I will try to keep on point with the projects, my job, industry interests or something with a semblance of relevance (don’t worry, no Twitter-type stuff here).

Finally, the biggest task of all through the end of this year will be another item that has been on my todo list for a long time: the unification of the projects into a single application, a suite if you will. All the projects share the common goal of improving the integrity and security of Linux servers, so it makes sense that they should share a common installation and management suite. This task requires a lot of planning, in addition to fundamental changes in how some of the projects operate and even complete rewrites of others, so please be patient and I will detail more on this later.

I will wrap up there and hope that you find the new site, as it evolves, both useful and informative.