The roads I take...

KaiRo's weBlog


Displaying recent entries in English and tagged with "preservation".

February 27th, 2014

Preserving Software: Emulators

It's been a while since I wrote a post here, and even longer since I wrote about preserving software. But there are two more topics on my list to write about from the event I attended last May. This is one of them.

One problem for preserving software is that the original hardware that the software ran on might not survive very long. Some people are still keeping old machines like the C64, Apple ][ and others running, but at some point there won't be many left as the original ones wear out or get damaged, and other hardware may already be unusable at this point. And for sure, those machines are not broadly available to the public. Ideally, we'd have the hardware and recreate the full experience, e.g. how you connected the machine to your own TV in the living room and played or worked with it there - but that is pretty unlikely or at least hard to do, esp. with the hardware being less and less available, as I mentioned.

But there's one way to bring at least part of the experience to users: We can emulate the old machines and let the preserved software run within that emulator. That doesn't give us the living-room-TV experience, but there's a better chance of both preserving that way of running the old pieces of software for a long time and making the experience broadly available. Now, it's not always easy to get emulators running well, but there are a number of projects out there, and we heard about a few interesting solutions at the preserving software event at the LoC. One of them was particularly appealing to us as Mozillians.



I blogged about The Internet Archive (archive.org) and Jason Scott some time ago already, and he was the one who mentioned this very appealing kind of emulator called JSMESS. What hides behind that name is the multi-platform MESS emulator, cross-compiled into JavaScript via Emscripten, a project that should be well-known here at Mozilla. :)



Since the event in May, a lot of work has flowed into JSMESS, and as Jason has blogged, there are now a thousand cartridges available in the Historical Software Collection of The Internet Archive, and performance within the browser is pretty decent now.

With that, a whole lot of old software is available for everyone, at any time, to try and experience within their own browser!

That's a powerful way to preserve software for the current world and upcoming generations, isn't it?

By KaiRo, at 02:40 | Tags: history, Mozilla, preservation, software | no comments | TrackBack: 0

November 7th, 2013

Internet Archive Fire: Donate to Rebuild

I just got word that a fire destroyed the Internet Archive Scanning Center in San Francisco.



I blogged a few months ago about what the archive has and can do, and I will probably mention it again when I get to more posts on preserving software.

I think it's in the best interest of everyone, esp. us as Mozillians, to keep this organization going and make the history of the Internet, and more, openly available to current and future generations.

Please help them rebuild and continue on their way by making a donation. I will for sure.

By KaiRo, at 19:04 | Tags: history, Mozilla, preservation, software | no comments | TrackBack: 0

August 2nd, 2013

Preserving Software: The Internet Archive

One presentation I found particularly interesting at the Software Preservation Summit at the Library of Congress was by Jason Scott of The Internet Archive (archive.org).

Jason talked about multiple efforts he's involved in, including his early (and ongoing) work on textfiles.com, collecting writing from the time when people first got online, and some other initiatives I'll mention at the end of this blog, but the main focus was on The Internet Archive, the non-profit he is working for nowadays and which has public collections of historical digital content as its main mission.


The site and organization are probably best known for the Wayback Machine, which has archived "over 240 billion web pages" going back more than 15 years, see e.g. a Mozilla homepage from around the time when I first encountered the project. But next to that, they have tons of other digital content archived - video, audio, texts, and more. Jason said they are basically seeking to store everything available in digital format that could be of any historical use at some point - preferably first making sure it's stored and worrying about legal questions only as they arise, as it's better to have something and need to take it down than to be allowed to publish it but have lost it to history. He went as far as to say they want to be "the hard drive of the Internet" and store everything anyone gives to them, be it personal documents, software that was published at some point, or other digital content. For example, their software collection contains collections of entire FTP servers of the past as well as CD images and terabytes (!) of software and firmware for old systems to run in emulators.

And there's an "Upload" button on the site as well, inviting me, you, and everyone else to contribute content that they can archive!
So, if you have old digital content lying around, go to archive.org and make it available to the public, including the kids of the future, before it gets covered with enough dust or otherwise degrades in a way that the media can't be cleanly read any more.

If you have really important pieces of history that are on media that you fear is too dusty and old to still be read cleanly, or where it's hard to find any drive to read that media any more, or you know of such things that might otherwise be hard to recover, you might be interested in another project that Jason Scott is involved in: the Archive Team. That group is dedicated to rescuing old digital content where that's not easy, and to saving history before it's actually lost. They have specialized equipment to read even aged disks and tapes, and they are building up communities to save sites before they die - they even archived most of Geocities before it died!
A quite awesome story is also how they helped to recover the original "Prince of Persia" source code.

And then, there's one more of Jason's projects that Mozilla folks will probably like: JAVASCRIPT MESS!

Jason used Emscripten to port the MESS emulator into JavaScript and run it from a browser. Yes, you can run Atari 2600 or Sega Genesis games in the browser! This is only a beta right now, but it shows how the "browser" (or should I say "web runtime"?) can help us make software history available to future generations!

All those projects can profit from your help, so if you have anything you can contribute, please do so! :)

By KaiRo, at 03:02 | Tags: history, Mozilla, preservation, software | 1 comment | TrackBack: 0

July 18th, 2013

Preserving Software: Artifacts and Metadata

One thing I found interesting at the software preservation summit was that some collectors told us that people investigating preserved software, e.g. for university studies or for museum exhibits, are often not interested in getting the software itself from the collectors they contact, as very often they could already get that via other channels, esp. when it's software that had been wide-spread at one point - an often-mentioned example that apparently is the corner-stone of all software preservation efforts is DOOM. ;-)

What many of those people writing works on preserved software, or museums doing exhibits on it, do want from collectors in those cases is artifacts, or if you want "meta-materials", around the software itself - packaging, guides, brochures, ads, posters, magazine reviews, and whatnot. With those pieces, any papers or exhibits on the software become way more interesting and can also convey some of the culture around the software.

And that made me wonder somewhat - I know we are preserving all binaries we ever shipped and all code at Mozilla, even our website code, but how many of the physical objects related to our software are we preserving? Well, we don't have packaging, but we had CDs for some stuff (I remember one for Mozilla 1.0), we did T-shirts, stickers, etc. - and there are surely magazine articles, the NY Times ad, and similar items. What of all that do we still have preserved? Do we have some kind of archive at Mozilla for that?

Here's a part of my "personal collection" of Mozilla artifacts:
Image No. 23152
I hope we have a better collection of those things somewhere at Mozilla headquarters or so. ;-)

A larger problem for preservation is if you want to preserve the environment and culture that the software was running in, e.g. how it was when you connected the C64 to the TV in your family's home, or even when you ran AltaVista (which has just been shut down) for Internet search. At this level, preserving, reproducing or even emulating the environment and experience of older software is becoming really hard - but an interesting challenge esp. for museums trying to educate new generations about our history.

Another, connected, topic is metadata of the software itself - from product names/versions and writers/vendors via info on installation media/packages to file names, checksums and settings of the installed software there is a lot of metadata one can collect along with the preserved binaries and/or code.

For example, NIST's National Software Reference Library (NSRL) - see also this interview by the LoC - is collecting a lot of information about the installed software, and also what it leaves behind when uninstalled (as their original purpose is to help the FBI find out what was installed on investigated computers).
And this metadata collection might actually provide us with an opportunity: Knowing the names and checksums of libraries installed with valid software can help us identify at least some of the libraries we see correlated with crashes. For that reason, we recently got the Dragnet tool online that is intended to help us there, and it would be great if metadata from NSRL or similar efforts could be connected to that and help us in our own investigations there.
So, here's a way that software preservation efforts can directly feed back into our current work on understanding current software and improving future releases of Firefox!
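
To make the checksum idea a bit more concrete, here is a minimal Python sketch of how such a lookup could work in principle. It is only an illustration under my own assumptions: the function names and the `known` mapping are hypothetical, and it does not reflect the actual NSRL data format or how Dragnet works.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def classify_modules(paths, known):
    """Split files into (identified, unknown) based on a checksum lookup.

    `known` maps hex digests to a human-readable product label. It is a
    hypothetical stand-in for a reference data set, not an official format.
    """
    identified, unknown = {}, []
    for path in paths:
        label = known.get(sha256_of(path))
        if label is None:
            unknown.append(path.name)
        else:
            identified[path.name] = label
    return identified, unknown
```

Libraries that end up in the "unknown" list would then be the ones worth a closer look when they show up correlated with crashes.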

By KaiRo, at 02:49 | Tags: history, Mozilla, preservation, software | no comments | TrackBack: 0

June 20th, 2013

Preserving Software: Museums, Archives, Libraries

As I mentioned before, I attended an event on preserving software at the US Library of Congress last month. Jon Ippolito from the University of Maine wrote up a great summary of who was there and what we discussed, so I won't go into those details and leave you with his words on that.

Instead, I'll do multiple short posts on my impressions and thoughts of the event and the subject, probably over the next few weeks.

The attendees were mostly people from the existing software preservation community in the US, and the majority of those people apparently knew (of) each other already. In addition, we had some people from the software creation community - Microsoft's (sole) archivist probably belongs to both the preservation and software communities, then we had a guy from GitHub, and finally, Otto and me from Mozilla.

One thing that I learned with regard to the preservation community is that there are basically three types of projects they operate: museums, archives, and libraries.

Museums collect only a small number of large milestones in history, but try to get as much on those as possible so they can build up a great exhibit for the public to learn about our and their past.
Archives build up large collections of items with the main intent of preserving them as well as possible and usually without any intent to provide them to the public; the items are only available to sporadic researchers. There may be metadata collected on the items that may be available to a larger public, though.
Libraries are somewhat in between: They build up larger collections of items and try to preserve them, but with the intent that some public has regular access to them, often in a very controlled manner, e.g. via reading rooms.

At this software preservation summit, we had a number of representatives of all three kinds of projects: Museums such as the Computer History Museum, the Museum of Modern Art or the MIT Museum, archives such as Microsoft's, NIST's NSRL (National Software Reference Library - yes, "Library" is a bit of a misnomer there) or the Internet Archive, and libraries such as the Astrophysics Source Code Library, university libraries or, of course, the Library of Congress.

In terms of software preservation, we found that those different organizations, doing different kinds of collections, can not just learn from each other, they can also help each other: Not every one of them wants every piece of software coming in, depending on what exactly they collect, so it may make sense to forward some pieces to other projects.

It was interesting for us as outsiders to the preservation community to see what those people are doing and how they are organized. In future posts, I'll get more into how and where we as software producers can work with them.

By KaiRo, at 02:22 | Tags: history, Mozilla, preservation, software | 2 comments | TrackBack: 0

May 17th, 2013

Preserving Software - Feedback Requested!

As Digital Preservation is part of the agenda of the US Library of Congress, they're doing a workshop on Software Preservation next week, and Mozilla was invited as an expert group. Otto de Voogd and I are in the delegation going there for Mozilla (I'll be roughly in the Washington, DC, area from Saturday until June 2) - and the text below is a guest post by Otto with questions that we would like some feedback on so we can represent the Mozilla community as well as possible:




On the 20th and 21st of May the Library of Congress holds a workshop on the topic of preserving software.
Otto de Voogd and Robert Kaiser will be representing Mozilla, putting forward our viewpoint as custodians of a codebase with a significant heritage and importance.

Many questions and thoughts arise. Here's an overview of ours; we look forward to feedback.


- Should archivists keep source code or executables or both?

Executables and source code are both valuable. Executables are valuable because the source code is sometimes not available, or perhaps the build tools are not, and setting up a build environment for older code can be a difficult and complex thing.

Source is valuable to determine how a program works. It also makes it possible to reuse code and algorithms, especially, but not only, in the case of open source software.


- Preserving documentation.

Preserving documentation that goes with software seems logical.
Would this need to go as far as preserving discussion threads and entries in bug trackers?


- Preserving environments/platforms.

It seems obvious that without preserving an environment in which the software can run, it is going to be impossible to experience the software.
Preserving such an environment should therefore be part of the software preservation effort.

To avoid the physical constraints imposed by preserving old hardware (which would be a preservation effort in its own right), a solution would be to build virtual machines and emulators.
As hardware capacity constantly grows, running virtual versions of older hardware should generally be feasible.

To fully recreate an environment we'd also need to preserve the operating systems and other software tools that the preserved software needs to run.
Those, being software themselves, would logically already be included in any software preservation effort.

Preserving documentation concerning environments would also be required.
To build virtual machines and emulators it would be helpful for hardware makers to make technical specifications available. One could envision this to become a legal requirement at least for older hardware.

Can we imagine a world where web based emulators would allow an online digital library to serve users worldwide? Users who would be able to run old software in emulators running in their browsers...


- Is everything worth preserving, if not how does one go about selecting what is worth preserving?

Does one need to preserve every version of software, just the last version or all major releases? What about preserving software that has not spread widely? Would there be some threshold, or some other criteria?


- How does one index software and search the library?

There will be a need to gather metadata about software and to preserve documentation, as we already mentioned. This metadata and documentation could serve to populate an index, enabling for instance the search for particular features.
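
As a rough illustration of what such an index could look like, here is a minimal Python sketch of a keyword index over metadata records. The record fields (`name`, `vendor`, `features`) and the function names are assumptions made up for this example; a real software library would need a much richer schema and search.

```python
from collections import defaultdict


def build_index(records):
    """Map each lowercase keyword to the set of record ids that mention it."""
    index = defaultdict(set)
    for record_id, record in records.items():
        # Hypothetical metadata fields, chosen only for this sketch.
        text = " ".join([record["name"], record["vendor"], *record["features"]])
        for word in text.lower().split():
            index[word].add(record_id)
    return index


def search(index, query):
    """Return ids of records containing every word of the query."""
    words = query.lower().split()
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result
```

A search for a feature such as "tabbed browsing" would then return only the records whose metadata mentions both words.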


- Can software preservation help in making code reusable?

If there are good ways to actually find relevant and useful code, this could lead to more reuse not only of actual code, but also of algorithms and concepts.
It may also become a valuable source for students who wish to learn about actual implementations of software solutions.

At the very least a minimum of metadata, such as publication dates, copyright owners and licenses, should be available to determine how certain code can be reused.
In particular for open source software we believe that software libraries should strive to make it available without restrictions.


- Preserving data formats.

The software preservation effort should also include an effort to preserve data formats, including technical descriptions of those formats and the tools to read, write and edit those formats.


- Can software preservation help in the discovery of prior art?

We believe it can, and as such preserving old code could be a great tool in preventing the repatenting of existing software concepts.

Of course we believe that software patents shouldn't exist in the first place, as software is already covered by copyrights, but at the very least prior art is a good avenue to prevent some of the worst abuse of software patents.


- How do copyrights affect software libraries?

A lot of software is licensed to be used on a particular piece of hardware or only available via subscription. How does this affect software libraries? Should there be exceptions like there are for traditional libraries?

In the life cycle of software, the commercially exploitable time is limited; anything older than 10 years likely no longer has any commercial value.
Maybe copyrights on software should be significantly reduced to something like 10 years, which is more than enough to cover the commercially exploitable timeframe of the software life cycle.

Such a limit would greatly enhance the work of software libraries, increasing availability and ease of access as well as removing a lot of the red tape involving requests for permission to keep copies.


- What about software as a service?

And what about software as a service, where neither the source code nor the executables are ever published? How can something like Gmail be preserved, when neither the service's code nor the environment is available to the public?


- Preserving "illegal" or cracked copies?

What if a copy of a piece of software comes from an illegal source? A cracked version with modifications maybe? They have value in themselves as they are a cultural expression.

What if such an illegal copy is the only copy still available? Would it make sense to preserve that too?

By KaiRo, at 00:08 | Tags: history, Mozilla, preservation, software | 2 comments | TrackBack: 0
