Working with Balder really gets me nostalgic. Nowadays, anyone doing graphics programming would choose OpenGL, DirectX or XNA to get real performance. Back in the day (mid 1990s), we didn't have the luxury of high-speed CPUs and dedicated 3D accelerated graphics adapters. Everything had to be done by the CPU. Balder has turned out to be just that kind of project: everything is done by the CLR and then the CPU.
Balder is now closing in on the beta stage, which means that we're getting close to the feature set we want to have for version 1 of the engine. While completing the features for the engine, we're also dabbling a little bit with optimizations. The true optimizations will happen in the final stages between beta and release, but it's natural while working on features to increase performance in some parts of the engine. We've only scratched the surface of what is possible to optimize; the potential is great.
The first optimization we did was a couple of weeks ago, when getting Balder ready for Silverlight 3. Silverlight 3 introduces something called WriteableBitmap, which gives access to a pixel buffer that can be manipulated pixel by pixel. After seeing the results from René Schulte's speed tests, it's obvious that WriteableBitmap is the right choice for the job, although I put quite a bit of work into the PNG streamer.
Over the course of this weekend, I managed to get quite a bit of work done, focusing on optimizations. One of our focus areas has been to get things running in parallel. Silverlight has great threading capabilities and is able to utilize multiple CPU cores very efficiently. In the release version of Silverlight 3, Microsoft has allowed cross-thread access to the pixels in a WriteableBitmap. This is good news for us. The previous version, the 0.8 alpha, had one thread doing all the work. This was not exploiting the possibilities, and left the CPU pretty much doing nothing half the time. The new solution has 3 buffers and 4 threads: one thread for synchronizing all the work being done and one for each of the buffers. Each buffer has a special purpose: one for clearing, one for rendering and one for showing, showing being the copying of pixel data to a WriteableBitmap.
The flow is as follows:
In addition to this, the rendering no longer writes a byte for every color component, but an entire 32-bit int to the buffer for every pixel. These optimizations have truly paid off; on my computer, the regular teapot test (about 1000 polys) gives a frame rate between 50 and 60. This is promising, considering we've hardly started optimizing at all.
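To illustrate the idea (a simplified sketch with hypothetical names, not Balder's actual code), the synchronizing thread rotates the three buffers between the clear, render and show roles once the worker threads are done, and each pixel is written as one packed 32-bit integer instead of four separate bytes:

```csharp
// Hypothetical sketch of triple buffering and 32-bit pixel writes.
public class FrameBuffers
{
    private int[] _clearBuffer;
    private int[] _renderBuffer;
    private int[] _showBuffer;

    public FrameBuffers(int width, int height)
    {
        _clearBuffer = new int[width * height];
        _renderBuffer = new int[width * height];
        _showBuffer = new int[width * height];
    }

    // Pack the four color components into one int instead of writing four bytes.
    public static int Pack(byte a, byte r, byte g, byte b)
    {
        return (a << 24) | (r << 16) | (g << 8) | b;
    }

    // Called by the synchronizing thread when all three workers are done:
    // the cleared buffer becomes the next render target, the rendered buffer
    // is handed to the show thread, and the previously shown buffer is
    // recycled for clearing.
    public void Swap()
    {
        var previousShow = _showBuffer;
        _showBuffer = _renderBuffer;
        _renderBuffer = _clearBuffer;
        _clearBuffer = previousShow;
    }
}
```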
A demo can be found here. It should look something like this:
I've been a Mac owner for about a year now – it takes some time getting used to coming from a hard-core Windows mindset. My goal, entering this new stage, was to soak up all the experience possible from the platform and use Windows as little as possible. But being a .net developer, Mono and MonoDevelop do not quite cut it, so Visual Studio is needed to do proper .net development. With my first Mac I had a Boot Camp partition with Windows installed and used Parallels to access the same installation as well. This opened up a can of trouble when going back and forth between Boot Camp and Parallels because of drivers and such, not to mention the Parallels services installed. After a while I figured it was probably better to have a dedicated virtual machine existing in a virtual image.
Several months later, I'm finally content with the speed of my installation. Here are my findings so far.
Stripped down Windows
If you're like me and not using Windows for much – in my case I basically have 3 things installed: Visual Studio, SQL Server and SQL Server Management Studio – you are probably better off with a stripped-down Windows. This can be achieved in many ways. One way is to take a standard Windows installation and start disabling services and removing Windows features. Another way is to get something called TinyXP. TinyXP is a distribution, not from Microsoft, that has most features removed. There is a boot menu where you can choose what you want installed, and the bare minimum does not include Internet Explorer, Media Player or Outlook Express. Most "nice to have" services are disabled as well. You can also choose to install it without any driver packs, to keep the size down. For a virtual machine where the hardware is constant, this is probably the best option. TinyXP, as mentioned, does not come from Microsoft, so using it is most likely a violation of the license. Instead of using TinyXP, one can use a tool called nLite, or vLite if you prefer Vista. The tool allows you to take your existing XP or Vista install CD/DVD and select what features you want to include. When you're done, the tool will create a new installation CD/DVD for you.
I prefer running Windows XP, as it has the basic feature set I need, and Visual Studio and all versions of .net run on it.
SCSI vs IDE
As with all VM software, Parallels comes with the option of running the hard drive on an emulated IDE or SCSI bus, the default being IDE as this is more compatible and Windows has default drivers for handling it. My experience so far is that running on SCSI gives a lot better performance, especially in Visual Studio. It doesn't matter what kind of physical disk you have; you can still use SCSI even if your physical drive is IDE. For Windows XP you can't simply switch to SCSI directly: Windows XP detects the SCSI adapter as a BusLogic one, and this will crash during boot. The trick is, for instance, to set your CD-ROM to be SCSI, leave your primary hard drive as IDE, boot into Windows, go to the device manager and install the SCSI driver, which is most likely marked with a question mark indicating an error. The drivers are located on a floppy image at /Library/Parallels/Tools/drivers.fdd – mount this in Parallels and install the drivers from the mounted floppy inside Windows XP.
Parallels options – Optimization
In Parallels 4, there were great improvements in how OSX caching is handled. By default, OSX will enforce caching for any files used, meaning that your virtual hard drives could potentially eat up a lot of your memory when in use. On the optimization page, you should leave the "Optimize performance for" option on Virtual Machine, and I prefer to use the Adaptive Hypervisor. What this means is that it will shift focus between OSX and your Virtual Machine when utilizing the CPU and other resources. Also worth mentioning: set the Better Performance option and untick "Enable battery in virtual machine"; that ensures full speed ahead, although it drains your battery.
Parallels has several productivity features such as shared folders, shared user profile, shared applications, smart mount and so on. My experience is that enabling Shared Profile and Shared Applications lags things a bit, so keeping these off will boost performance a bit.
Parallels has the option of virtualizing your different cores; you can select how many cores you want to use for your virtual machine. I go with the number of cores I have (which is 2 on my MacBook Pro); there is quite a performance boost in doing so.
3D Acceleration + viewmode
One of the neat features of Parallels introduced in Parallels 3 was the ability to support 3D acceleration. This is really nice if you're using any 3D software or playing games, but for doing software development this does not make any sense. In fact, it seems to slow down regular graphics a bit and if you're using software such as Blend or Visual Studio 2010 that relies on WPF (Windows Presentation Foundation), it seems to be even slower. My tip is to leave this off and just allocate enough memory to cope with your screen resolution, typically 16MB should be enough.
Another thing that Parallels comes with is the ability to run in what is called the Coherence view mode. This makes your Windows applications appear as if they are part of the OSX desktop, and they float around like any other OSX window. This is great if you're in a multi-monitor environment and want to keep some of your Windows windows on separate monitors. But there is a downside, performance-wise. I prefer running in fullscreen, even on a multi-monitor setup.
Since my Virtual Machines are primarily there to host Visual Studio, I have chosen to disable both sound and USB. Parallels supports virtualization of USB, meaning that the entire USB hub is available within the Virtual Machine and anything connected to any USB port can potentially be connected directly in the Virtual Machine. If you don't really need these, this is another performance boost.
This tip is more of a best practice, regardless of whether you're running in a virtual machine or not: have a separate partition or drive for your swap file(s). In Parallels I've chosen to have a separate hard disk. This boosts performance quite a bit.
Fixed size disks
By default Parallels creates growing disks, which are quite a bit slower than fixed-size disks. Since I have several virtual machines, I create my images with a fixed size and set them to the bare minimum, typically 8 or 10 GB, and all my data is on a separate virtual hard drive that all virtual machines share.
Even if I run a stripped-down Windows, some services are still left on after installation. Manually turning these off will increase the performance of the Windows installation and use less memory. Some of the services you might find running and might not need are:
* Image acquisition – typically used for scanners and such
* Printing – I don't print from Windows, in fact I never print at all – environmental awareness
* Firewall – I don't need it, I have a firewall on the OSX side already and run in Shared Network mode, besides I never surf the web on my Windows installations
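These can be turned off in services.msc, or from a command prompt inside the virtual machine; for example (service names as found on a standard XP install):

```shell
rem Disable Windows Image Acquisition, the print spooler and the firewall/ICS service.
sc config stisvc start= disabled
sc config spooler start= disabled
sc config sharedaccess start= disabled
```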
Why don't you just buy a Windows machine?
This is a question I get a lot. Well, there is no simple answer. At this point in time, I want to explore different things and not get locked down. I must admit that the look and feel of Apple hardware is quite appealing as well. 🙂 Even though all of the above research has taken some time to figure out, it has been well worth it. I now have the flexibility I want and can explore multiple operating systems. Sure, I could just install a "hackintosh" in Virtual PC or something in Windows on a normal PC, but I wanted the real deal, and for now I'm content with the situation.
There are probably a lot more performance tips out there – I am still digging in my quest to optimize everything to perfection. If you have further tips, please leave me a comment.
I'm working on a WPF 4 project, targeting .net 4, and need to use Visual Studio 2010 beta 1 for this. So far I've been handcoding all my Xaml (I know, the designer in VS2010 is quite good – but I kinda love doing Xaml), but I wanted to do some more advanced graphics and I am somewhat used to working with Blend for that. So I decided to download the Blend 3 Trial and get going. The only thing was that the installer kept saying "Another installation is in progress. You must complete that installation before continuing this one". I downloaded the Windows Installer Cleanup Utility, but it didn't solve anything. In fact, I couldn't even install it, since it is itself a Windows Installer package; I had to install it on another computer and copy the files over.
The event viewer was referring to the 'VSTA_IDE_12590_x86_enu' component and a missing directory. This is a package installed by Visual Studio, called Microsoft Visual Studio for Applications 2.0.
Turns out, the solution is really simple. Just create the directory and try installing again, in my case I had to reboot and then do the installation of Blend.
And, by the way, if you want to open your project in Blend 3, you need to change the target framework to something that Blend recognizes, 3.5 or less. Read more over at Charles Sterling's blog. For me, this means maintaining a separate .net 3.5 project for my frontend for working with Blend, as I am targeting .net 4.
Balder has been updated to support the release version of Silverlight 3.
The previous samples posted here now work with the release version.
You can find the samples directly here and here. I've also created a version that just shows off that the texture mapping is very real; it can be found here. 🙂
We're working on new demos to truly show off Balder – its feature richness and capabilities are way beyond a rotating teapot. But the focus on getting everything right in the engine has taken energy away from creating a proper demo. Fear not, it will come. Stay tuned.
I guess most of you already know that Silverlight 3 is out. Exciting times. This version is truly a remarkable release in the history of Silverlight so far. With it come some changes in the API for those of you who have been working with the beta releases of Silverlight 3. The most noticeable for me was the WriteableBitmap class. WriteableBitmap can be used if you want to draw things yourself onto a surface that can be rendered by Silverlight.
The first change lies with the constructor: previously it had an argument specifying what pixel format the bitmap should be in; this is now gone.
Secondly, you no longer have to lock and unlock in order to write to it; you simply index the new property called Pixels, and after you're done you call Invalidate(). This makes a lot more sense.
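Put together, the new usage looks something like this (a sketch of the release API as described above; dimensions and color are arbitrary):

```csharp
// Silverlight 3 release: no pixel format argument, no Lock/Unlock.
var bitmap = new WriteableBitmap(640, 480);

// Pixels is an int[] of whole 32-bit pixels - write them directly.
for (var index = 0; index < bitmap.Pixels.Length; index++)
{
    bitmap.Pixels[index] = unchecked((int)0xFF0000FF); // opaque blue
}

// Tell Silverlight the buffer changed so it gets redrawn.
bitmap.Invalidate();
```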
The third thing I noticed is the SetSource() method, which I had never seen before. I haven't had the chance to check it out yet, but I guess you don't have to be a rocket scientist to figure that it has the ability to take a stream source and stream the pixels into the bitmap.
All these changes are something Balder will benefit greatly from; initial testing suggests performance has gone up quite a bit, and I believe there might be something to gain from streaming the pixels instead of the way we are doing things internally in Balder today. Updates have been made to Balder to work with the release version of Silverlight 3 – a new release will be posted on CodePlex shortly.
Earlier I posted about some extensions I did for Silverlight handling INotifyPropertyChanged and helpers for DependencyProperties. Recently I've had a couple of requests to release a downloadable source with samples of their use. Since the original posts (found here and here), I've refined them a little bit and worked out some quirks that were left in the originals.
So, why should one use these kind of extensions and what do they solve?
INotifyPropertyChanged and creating DependencyProperties rely on the usage of string literals. When, for instance, notifying the PropertyChanged event with a change on a particular property, the argument one passes in is the literal holding the name of the property. This is bad for at least a couple of reasons:
* Refactoring goes out the window – renaming the property means renaming the literals by hand
* Obfuscation – if one were to obfuscate the code, literals will stay the same but your property names will change, and your code is broken
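The idea behind such extensions can be sketched like this (hypothetical names, not the exact code from the download): instead of passing a string literal, you pass a lambda and let the helper pull the property name out of the expression tree.

```csharp
using System;
using System.ComponentModel;
using System.Linq.Expressions;

// Hypothetical sketch of a literal-free PropertyChanged helper; the actual
// extensions in the download differ in naming and details.
public static class NotifyExtensions
{
    // Instead of raising PropertyChanged with the literal "Name", you write
    // owner.Notify(handler, o => o.Name). The property name is extracted
    // from the expression tree, so rename refactorings update the call
    // sites automatically instead of leaving stale strings behind.
    public static void Notify<T, TProperty>(
        this T owner,
        PropertyChangedEventHandler handler,
        Expression<Func<T, TProperty>> property)
        where T : INotifyPropertyChanged
    {
        var member = (MemberExpression)property.Body;
        if (handler != null)
        {
            handler(owner, new PropertyChangedEventArgs(member.Member.Name));
        }
    }
}
```

The same expression-tree trick can be applied to DependencyProperty registration helpers, so the registered name always matches the CLR property.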
I've wrapped it all up in a nice download with both a Silverlight and a WPF version of the code (actually pretty much the same code; you'll find #if(SILVERLIGHT) #else #endif statements where specifics are needed). Also in the download, you'll find a Silverlight sample with a user control implementing a dependency property and a data object using the INotifyPropertyChanged extensions. In addition, there are a few other nifty helper classes and extensions for other aspects of both Silverlight and WPF development. Hope you'll find it handy.
The last year I've been close to having a split personality when it comes to the projects I've been working on. On my last count, I have 4 active CodePlex projects that I've created myself, and 4 more that I'm a member of:
I decided a couple of months ago that I will pick one project and focus on it only for now, instead of spreading my attention thin with working one hour here and there on different projects. The choice was kind of obvious on my part, Balder got my attention for now. My reasons for choosing that particular project as my main focus, are many. First of all, it is the most mature of the 4 mentioned above, secondly, it is truly something I love doing – graphics programming has always been close to my heart. We're now running the last leg on Balder to get to a version 1.0 release. When this release is out I will continue on one of the other projects. I haven't decided on which one yet, but my money is on OpenCompiler – creating compiler extensibility for C#.
So, for anyone holding their breath for features or bug fixes in anything other than Balder for the next couple of months, start breathing again. 🙂
Tollef Slaathaug is a senior developer at Baze Technology, a software company based in Porsgrunn, Norway. Tollef has great experience working with optimizations in .net code. In this show we're talking about their project and how they have done things to get the desired performance.
The recording of this show was done a little more than a month ago. Normally I use Oovoo to do the shows with people sitting remote; it has excellent audio and video quality. This day, my freebie account had expired – instead of opening up a new account, which I should have done (twenty-twenty hindsight and all), I decided to use Skype and found a 3rd-party application that could intercept Skype calls and save the streams to disk. All fine, so far.
Come editing day, we had two takes, due to loss of connection, and the application had therefore produced two AVI files. All good, or at least I thought so. It turns out it had created AVI files with multiple video and audio streams in them, two streams for video and two for audio, representing both cameras and microphones. Seemed reasonable enough. The only problem is that most editing software didn't figure this out. On top of this, it had used Microsoft MPEG4 for video, and the audio was supposedly in WMA format, but most applications didn't recognize it.
Anyhow, weeks went by, I went on a holiday, came back and was ready to start editing again. I figured I'd use GraphEdit, which was part of the DirectShow SDK and later the DirectX SDK. Turns out it isn't there anymore. I googled away, found a nice version someone had created themselves, built a graph and got all the streams out into separate files. I felt really proud of myself.
It was finally time to get it all into GarageBand, edit and get it all out. The trouble was not over. The application I had used to hook into Skype and stream it all to disk had done something quite interesting: the four streams I was sitting with all had different frame rates, and some pretty obscure frame rates at that (30.118 and such). GarageBand would not cope with getting these in.
Long story short, after invoking some 10 audio and video tools, I managed to extract the audio in a sensible manner and editing could finally start.
Lesson learned: be wary of nifty tools you find; check references. 🙂
Notes: The intro and outro music was created by Kim M. Jensen.
Please do not hesitate to leave any comments on this post. If you have ideas for people you’d like to get interviewed, please leave a comment or contact me through the contact page.