On Visual Studio 2010 UI, you can only the one icon for all configurations. But when you open .csproj file as plain text, you can see it's real powerful. All per configuration settings are under tag like this:
So just find
Xueqing Sun
June 12th, 2011
You can distribute, remix, tweak, and build upon this work, even commercially, as long as you credit Xueqing Sun for the original creation.
I was reading a book, ‘Image Processing, Analysis, and Machine Vision, Milan Sonka, Vaclav Hlavac, Roger Boyle’ for studying mean-shift object tracking algorithm. But the book was really hard to understand, especially for a beginner of machine vision. Yesterday, with the help of Guangwei, a friend of mine (he is a scientist at National Astronomical Observatories, Chinese Academy of Science), I finally understood the algorithm. And I would like to explain what I understood for sake of helping other guys who are still struggling on this algorithm.
Heads up Questions
1. What is the probability density function of an image?
2. How does the density function contribute to target model?
3. What’s kernel function? How does it work?
Here is the summary of the algorithm
To locate a target on an image, we must have a target model, an image which contains the target. The goal of mean-shift algorithm is to tell you the position of your target on the image. Suppose we have a target model image with 5*5 pixels. The pixel values listed below:
x | y | pixel |
0 | 0 | 1 |
0 | 1 | 1 |
0 | 2 | 6 |
0 | 3 | 5 |
0 | 4 | 7 |
1 | 0 | 0 |
1 | 1 | 4 |
1 | 2 | 7 |
1 | 3 | 0 |
1 | 4 | 4 |
2 | 0 | 3 |
2 | 1 | 5 |
2 | 2 | 7 |
2 | 3 | 1 |
2 | 4 | 5 |
3 | 0 | 7 |
3 | 1 | 0 |
3 | 2 | 9 |
3 | 3 | 9 |
3 | 4 | 2 |
4 | 0 | 0 |
4 | 1 | 1 |
4 | 2 | 4 |
4 | 3 | 3 |
4 | 4 | 3 |
Target model image
The row x and y are the location of a pixel on the image. And the third row ‘pixel’ is the value. For convenient purpose, we limit pixel value from 0 to 9. Now we can create a frequency table like below. Normally images are colored, but we simplified the problem from colored images to gray scale images. A colored image can be considered as a composition of three gray scale images. So the rationale explained here can be easily applied to colored images.
bin | q |
1 | 8 |
2 | 1 |
3 | 3 |
4 | 3 |
5 | 3 |
6 | 1 |
7 | 4 |
8 | 0 |
9 | 2 |
Frequency
The frequency means how many pixels located in a specific bin. For example, the first row means there are 8 pixels which values are less or equal to ‘1’. Furthermore we can have histogram chart for the table.
Histogram
On the book, and many other materials, the density function of target model, defined as below:
1 |
The density function is closely related with the frequency table. Here the black is a vector which has m components. In this case, the m is 9 because we have 9 bins. Each component is made from corresponding bin. Let look into the component definition.
2 |
The definition looks somewhat complex. Let’s split it into parts so we can understand it easier. First, let’s see the delta part
3 |
n is the pixel number. In this case we have 25 pixels. So the n is 25. is the coordinate of ith pixel in the image. For example, the first pixel coordinate is (0,0), and the second is (0,1). u means we are currently defining uth component corresponding to uth bin. Function
tells which bin the pixel
go. For example, the first pixel (0,0) is 1, so it goes to bin 1, i.e.
.
Actually, the formula 3 is a vector, let’s rewrite it as following:
4 |
The delta function gives you 1 if the pixel goes to bin u, otherwise it gives 0. So this vector works just like a mask. On the mask there are ‘1’s mark the pixels go to bin u. For example, if our image is 4*4, we might get a D vector like this:
1 | 0 | 0 | 1 |
0 | 1 | 1 | 0 |
0 | 0 | 0 | 1 |
1 | 0 | 0 | 0 |
The image is a matrix. D is a vector. Just put each row of the matrix to one row in sequence then we’ll get D.
IMPORTANT
What’s the relationship between D and the frequency table? The answer is for each u,
. Actually we can form feature space only based on D, because D already represents the image in statistics way. So mean-shift algorithm is so called an algorithm that based on probability density function. But marginal pixels are unstable, so the kernel function which is a weight function does the work. It give marginal pixels less weight to make the objective function in optimization process more stable.
OK, let’s see another part of formula 2. It’s
5 |
Again it’s a vector. Function k is the kernel function which actually is a weight function giving each pixel position a weight. So can be defined as below:
6 |
It looks beautiful. I really hate which makes formula unreadable.
OK, now we get the density function of target model image. The candidate density function is pretty similar.
7 | |
8 |
Formula 8 is easier to understand because we’ve already understood formula 2. The k function in formula 8 is just a transformed version of original k.
I think I’ve solved the hardest part of the mean-shift algorithm. And the rest of it would be much easier to understand because it is just kind of an optimism algorithm to find the y.
Though Windows isn’t a real-time OS, we do need high resolution clocks and timers sometimes. .Net gives us Stopwatch class which is really a high resolution clock. But we can’t find a high resolution timer in .Net framework. A timer is different from a clock, which fires event periodically notifying you the period of time is up. This article is mainly to demonstrate the basic usage of interop. And a more complex and intensive material provided at the end of the article.
There are high resolution timer services on Windows. But they are native APIs. Fortunately, we can use interop to call these native APIs.
The first thing you have to do is to determine the device capabilities. Different devices have different resolutions[1]. Some may give you 1 ms of resolution, but some may be 10 ms. And if you request a resolution that the device can’t support, an error will occur. To determine the resolution of the timer hardware, use timeGetTime().
[DllImport("winmm.dll")]
private static extern int timeGetDevCaps(ref TimerCaps caps, int sizeOfTimerCaps);
TimerCaps is a struct. A C struct actually. We can use C# struct to map the data.[StructLayout(LayoutKind.Sequential)]
public struct TimerCaps
{
/// <summary>
/// Minimum supported period in milliseconds.
/// </summary>
public uint periodMin;
/// <summary>
/// Maximum supported period in milliseconds.
/// </summary>
public uint periodMax;
}
Please keep in mind that “LayoutKind.Sequential” is essential, because it suppressed C# compiler from alter the order of the fields for optimization purpose. Here for the parameter “TimerCaps caps” we use “ref”, because we want the API to write some data into “caps”. The “ref” maps the type to the pointer of the type in C/C++. So “ref TimerCaps caps” means “TimerCaps *caps” in C/C++.
Now we can call the API.
TimerCaps caps = new TimerCaps();
int ret = timeGetDevCaps(ref caps, Marshal.SizeOf(caps));
And then we can start the timer.
[DllImport("winmm.dll")]
private static extern int timeSetEvent(uint delay, uint resolution, MmTimerCallback proc, int user, int mode);
...
actuallPeriod = Math.Max(caps.periodMin, Period);
timeBeginPeriod(actuallPeriod);
timerId = timeSetEvent(1000/*delay*/, 0/*resolution*/, TimerCallback, 0, 1);
“MmTimerCallback” is a delegate method mapping to callback function in C/C++. That’s almost the complete stuffs of the high resolution timer. Please be aware, the callback function will be called in a thread other than your main thread. You have to delegate the operation to your main thread if you want to access UI elements.
Here is a good read material talks about interop.
There are several ways of letting an application to quit programmatically. But you should choose the best one among them.
This method will force a process to terminate. So the process has no chance to do cleaning up work. No code after Environment.Exit will be executed. Any data that is unsaved will be lost. Not recommend this way.
Killing a process just like Environment.Exit, terminates the process immediately and the process doesn't have the last chance to breath. So again, it's not recommended.
This method will send a request to a process' message loop, if the process does have a message loop. The process thus has a chance to do cleaning up work, and save unsaved data to disk then terminates gracefully. This method can be called either from another process or within the process that to be terminated.
Environment.Exit Method http://msdn.microsoft.com/en-us/library/system.environment.exit.aspx
Process.CloseMainWindow Method http://msdn.microsoft.com/en-us/library/system.diagnostics.process.closemainwindow.aspx