I'm trying to get the Eclipse IDE (Integrated Development Environment) set up and working on my recently repaired Linux system, and simultaneously I want to become familiar with the ins and outs of its operation. Nothing better for this than tackling a small programming project. I have an old program that I've been meaning to port to Linux, so this looks like the right time and place to tackle it. dedup is not a fancy program, but it can be useful, especially for someone like me who copies whole directories willy-nilly just to make sure I have saved a copy of what I need.
I started working on this program about a week ago, but I was so muddle-headed I wasn't making any headway. I don't know if my allergies were acting up or I had some kind of bug, but in any case I wasn't sleeping well and the result was that my head felt like it was full of wool. This weekend things cleared up a bit and I was able to think clearly about this program. And then, because I have found that writing down what I am trying to accomplish often helps clarify the problem, I wrote up the following explanation. I have since implemented my revised plan. It needs a little testing and polishing, but I expect to post it before too long.
dedup is a program for deleting duplicate files.
Given a starting point on a directory tree, it follows each branch to the end, then backs up to the last node and takes the next branch.
When it encounters a file, as opposed to a directory, which could be anywhere on its travels, it marks the spot and then proceeds to explore the rest of the tree. If it encounters a file with the same name, it compares the length of the two files and then, if they are the same, it compares their contents. If the contents are identical, it deletes the most-recently-encountered file.
It then returns to the previously marked spot and resumes its exploration of the directory tree.
The first version of this program returns to the 'previously marked spot' by starting over and counting the number of files it encounters. For the first file this won't take too much time or trouble, but if the directory tree contains a large number of files, by the time you get towards the end, it could be taking a very long time.
As a matter of fact, this version of the program will completely traverse the directory tree once for each unique file found therein. It is not what you would call time efficient, but then it doesn't use very much memory.
A slightly improved version would 'mark the spot' by writing down its current location in the directory tree, and then when it had finished processing one file, it could go immediately back to the 'marked spot' without having to traverse the entire tree all over again. This would speed things up by approximately a factor of two, and it wouldn't require any more memory, or at least not enough to notice.
Another way to do this would be to record the name of every file and the pathname of the directory where that file is stored. A simple sort operation on the list of filenames would enable you to quickly identify candidates for comparison and possible deletion. The problem with this is that if you have a very large number of files, this could take a large amount of memory. However, computers these days have large amounts of memory, so the speed advantage should make this worthwhile.
So the way we are going to do this is we will traverse the entire directory tree once, recording the names and paths of all the files as we go. Sort the list of filenames, then scan the list looking for any duplicates, compare the metadata for the files, and if warranted, compare the files themselves. If identical, then we delete the newer of the two files.
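Here is a minimal sketch of that plan in Python, which is not the language of the program described here. The function names `collect_files` and `find_duplicates` are made up for illustration, and the sketch only reports duplicates rather than deleting anything. It assumes "newer" means a later modification time.

```python
import filecmp
import itertools
import os


def collect_files(root):
    """Walk the tree once, recording (filename, directory) for every file."""
    entries = []
    for directory, _subdirs, filenames in os.walk(root):
        for name in filenames:
            entries.append((name, directory))
    entries.sort()  # files with the same name end up next to each other
    return entries


def find_duplicates(root):
    """Return (newer_copy, older_original) pairs. Nothing is deleted here."""
    duplicates = []
    by_name = itertools.groupby(collect_files(root), key=lambda entry: entry[0])
    for name, group in by_name:
        paths = [os.path.join(directory, name) for _name, directory in group]
        paths.sort(key=os.path.getmtime)  # oldest first, so newer copies get flagged
        kept = []
        for path in paths:
            # Cheap metadata check (size) first, then a full content comparison.
            original = next(
                (k for k in kept
                 if os.path.getsize(k) == os.path.getsize(path)
                 and filecmp.cmp(k, path, shallow=False)),
                None,
            )
            if original is None:
                kept.append(path)
            else:
                duplicates.append((path, original))
    return duplicates
```

Calling `find_duplicates(".")` returns a list of pairs that could be reviewed before anything is actually removed with `os.remove`.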
Filenames and paths are stored in a string table. Each entry in our list of filenames will then simply hold two pointers into this string table, one to the pathname, and one to the filename.
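One way to picture the string table is the following rough Python sketch. The class name `StringTable` and its methods are hypothetical, and offsets into a byte buffer stand in for the pointers. It assumes strings are stored NUL-terminated, as they would be in C.

```python
class StringTable:
    """All strings packed end to end in one buffer, each terminated by a NUL byte."""

    def __init__(self):
        self._data = bytearray()

    def add(self, text):
        """Append a string and return the offset where it starts."""
        offset = len(self._data)
        self._data += text.encode("utf-8") + b"\0"
        return offset

    def get(self, offset):
        """Read back the string that starts at the given offset."""
        end = self._data.index(b"\0", offset)
        return self._data[offset:end].decode("utf-8")


table = StringTable()
entries = []  # each entry is (pathname offset, filename offset)

# A directory's pathname is stored once and shared by every file in it.
path_offset = table.add("/home/user/photos")
for filename in ("b.jpg", "a.jpg"):
    entries.append((path_offset, table.add(filename)))

# Sorting the entries by filename only moves the small offset pairs around.
entries.sort(key=lambda entry: table.get(entry[1]))
```

The path `/home/user/photos` and the two filenames are placeholder data. Python's `bytearray` grows on its own, so the sketch sidesteps the allocation question a C version would have to answer.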
It might be more memory efficient to break up the pathnames into sequences of strings and store the indices for those individual bits, but that would complicate matters as we would need to constantly be searching for previously used pathname segments, which would mean a binary search routine and an insert routine. I'm not sure the memory saved would be worth the effort involved in coding or execution.
The last problem is how to decide how much memory to allocate in the first place, and then if you discover it is not enough, how to allocate more in such a manner that all your work is transferred transparently to the new allocation.
Silicon Forest
If the type is too small, Ctrl+ is your friend
Showing posts with label Eclipse IDE. Show all posts
Monday, June 20, 2016
Monday, April 25, 2016
Lana & Linux
[Image: Lana Del Rey in Linux Mint Update]
P.S. I got the current version of Eclipse installed and it's working pretty well. So far.
Sunday, April 24, 2016
Goclipse
[Image: Eclipse IDE Icon by Necromod]
I've been playing with Google's Go programming language using the Coding Game web site. It's great for playing around, but now I want to try some actual stuff, which means I need a compiler on my computer. Traditionally development has been done using a text editor, a compiler and a debugger running on a terminal, or after the dark ages, running in a terminal window (aka a DOS box) on a PC. Microsoft brought us Visual Studio which integrated these components into one program, hence the IDE tag, which stands for Integrated Development Environment. Visual Studio was great, but it was designed specifically for developing programs for Windows. You want to work somewhere else, you were out of luck. Out of luck until a growing army of programmers decided that they had had enough of the dark ages and they wanted a shiny, new IDE too. So they wrote one. Or a hundred.
Eclipse is one of those IDEs. It's been around for a while. It runs on Linux. I installed it on my machine yesterday, but somehow I got version 3.8.1 instead of the current version, 4.5. I probably missed an update somewhere in the last couple of years.
Anyway, I'm looking for the bits and pieces that will set up Eclipse to handle Go and I come across this little note:
Github is a big source code repository. If you are behind the Great Firewall of China, you are very likely to encounter problems installing GoClipse: blocked connections, timeouts, or slow downloads. This is because the update site is hosted in Github, which is blocked or has limited access.
Note for users in China
Thursday, March 26, 2009
Nvidia Twinview & Linux
I finally got around to installing the Nvidia driver that supports two screens. This is the proprietary Nvidia driver, not the open source driver. It wasn't too much trouble and it seems to work pretty well. Downloading and installing the driver went very smoothly. The hiccups came with the configuration.
Update: Chewearn's blog was very helpful.
My hardware:
- ViewSonic VA902b LCD flatpanel display, 1280 x 1024 resolution.
- Princeton VF912 19" CRT with resolution out the ying-yang (data sheet).
- Nvidia 7600 GT dual head video card.
Both of my screens are roughly the same size, so I wanted to set the resolution to be the same on both. The first resolution drop down box in nvidia-settings has a list of maybe a dozen settings, but it does not have 1280 x 1024. It has several that are smaller, and one that is bigger. Bugger. Some fooling around and I click on the resolution box for the other screen and it has got a list of about a gazillion settings. Whoa! And one of them IS for 1280 by 1024. Okay, so we will switch the cables on the ports on the card. But now the LCD screen complains the signal is NOT IN RANGE. Bah. Now what? 1280 by 1024 IS the correct resolution. Is there something else we can change? The second part of the resolution line contains the vertical refresh frequency. I end up setting it to 75Hz and that seems to work.
The instructions I found warn not to Save to X Configuration File, it will bugger your system. It means running nvidia-settings after each re-boot, but at least it works. The instructions were written a few months ago, and things change, so I thought I would try it and see if it works. Especially after I rebooted and then ran nvidia-settings again. I click on apply and everything disappears. I have to quick-lean-down-and-swap-the-cables-before-the-five-second-time-limit-expires-and-everything-reverts-to-the-way-it-was. It takes three or four tries before I manage. Okay, we certainly don't want to have to deal with this every time we reboot so let's see if writing to the config file works or not.
The first time doesn't, but that really isn't surprising, I was not master-of-my-universe (sudo, or super user do). Invoking the program with the sudo prefix (sudo nvidia-settings) fixed that, and rebooting worked fine. So far. This is what the setup looks like in the dark. Aren't all the LEDs pretty?
I have encountered some minor bugs, but basically things are working okay. The main reason I wanted this was for working with Eclipse (a software development tool). It wants to put up all these panes, and there just isn't enough room on one screen. The way Twinview works here is kind of interesting. If you click on the maximize button, the program window will fill one screen. So how do you get it to fill two screens? Do I have to go back and make a major configuration change? No, simply unmaximize the window, and then stretch it to fill both screens. Kind of backwards, but it works.
As far as bugs go, I have noticed two things:
- Title bars on windows sometimes disappear. Sometimes they just go white, sometimes they vanish completely and the window title appears in some kind of mutated kaleidoscope font. When they go white, the window control buttons still work. The tool tips still pop up when you are over them.
- I was running Firefox yesterday and I opened a bunch of windows and the screen I was working on started fading to gray. I would click something and it would come back, but then it would go to gray immediately. Things went downhill from there.
Update November 2016: replaced missing pictures.
Sunday, March 1, 2009
Fun With Numbers
I have been fooling around with a program to compute the Ackermann function. I say fooling because I haven't really put any serious analysis into it. I have just been trying different things to see what happens.
One thing I did was install the GMP library on my Linux box. I changed my program to use the functions provided by this library instead of just performing regular arithmetic. Being as the results of the Ackermann function can run into thousands of digits, I was going to need something. I thought about writing one myself, and that might have been fun, but let's try and focus, shall we?
I am beginning to think that a simple minded implementation of Ackermann is not going to work. I may be wrong, but I suspect that there might not be enough virtual memory on the planet to accommodate a simple minded solution.
Say the answer (to a particular evaluation) runs to 10,000 digits. Two to the 10th power (2^10) is 1024, or roughly one thousand, or three decimal digits. So a 10,000-digit number is roughly 1000^3333, or 2^33330. Say we can get by with 16 bytes (2^4) of stack space for each call. If the depth of our recursion is proportional to the size of the answer, then we will need 2^33334 bytes of stack. A terabyte is roughly a trillion bytes, or 2^40. 2^33334 / 2^40 = 2^33294 terabytes.
Notice that neither the number of bytes of stack per call nor the number of bytes in a terabyte makes any real impact on the original number.
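The exponent arithmetic above can be checked with a few lines of Python. This is only a sketch of the estimate, under the same assumption that the recursion depth tracks the size of the answer; the variable names are invented for illustration.

```python
import math

DIGITS = 10_000

# Rule of thumb from the text: every 3 decimal digits is about 10 bits.
rough_bits = (DIGITS // 3) * 10                    # 33330

# A tighter figure: each decimal digit carries log2(10), about 3.32, bits.
precise_bits = math.ceil(DIGITS * math.log2(10))   # 33220

# 16 bytes of stack per call multiplies by 2^4, so add 4 to the exponent.
stack_bytes_exponent = rough_bits + 4              # 33334

# Converting bytes to terabytes divides by 2^40, so subtract 40.
terabytes_exponent = stack_bytes_exponent - 40     # 33294

print(f"stack needed: about 2^{stack_bytes_exponent} bytes, "
      f"or 2^{terabytes_exponent} terabytes")
```

Using the tighter figure instead of the rule of thumb shifts the exponent by about a hundred, which changes nothing about the conclusion.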
Of course it may be that the amount of stack space required is NOT proportional to the size of the result, in which case it might be possible to arrive at a solution.
But then there is the time involved. Even at the fastest clock speed incrementing a number from zero till it has 10,000 digits is going to take a long time. Strikes me now that the amount of time required is going to be similar to the amount of space we computed above. In other words, we won't have an answer before our Sun burns out, or goes Nova.
So how do we compute the value of the Ackermann function? Someone has put together a simple mathematical expression that purports to compute the value of the Ackermann function. So one fairly simple test I could do would be to implement that expression and see if it delivers the same result as reported elsewhere.
But how do we know if that is the correct answer? That is going to require analysis of the original definition to see if it does in fact correspond to the expression in question. That is going to require thinking.
Of course the real question is why am I even fooling with this? Same reason I do the Jumble (most) every morning: I need a little mental exercise every day. Keeps me from getting bored. Also I am acquiring L33T Linux skills by mucking about with GMP and Eclipse.
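A test along those lines might look like the Python sketch below. It assumes the common two-argument (Ackermann-Péter) definition and its well-known closed forms for m = 0 through 3; the variant and the expression discussed in this post may be different ones, so treat the function names and formulas here as stand-ins.

```python
def ackermann(m, n):
    """Two-argument Ackermann-Peter function, straight from the recursive definition."""
    if m == 0:
        return n + 1
    if n == 0:
        return ackermann(m - 1, 1)
    return ackermann(m - 1, ackermann(m, n - 1))


def closed_form(m, n):
    """Closed-form expressions for the first four rows (m = 0..3)."""
    if m == 0:
        return n + 1
    if m == 1:
        return n + 2
    if m == 2:
        return 2 * n + 3
    if m == 3:
        return 2 ** (n + 3) - 3
    raise ValueError("closed form only given here for m <= 3")


# Compare the two on inputs small enough for plain recursion to finish.
mismatches = [
    (m, n)
    for m in range(4)
    for n in range(5)
    if ackermann(m, n) != closed_form(m, n)
]
print("mismatches:", mismatches)
```

Agreement on small inputs is evidence, not proof; the analysis of the definition is still needed to trust the expression for larger arguments.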
Tuesday, February 24, 2009
Cooking up something ...
Learning to use a new software program is like going into someone else's kitchen and trying to fix a meal. You know they have everything you need to get the job done, but you have no idea where they have hidden it, so you spend all your time looking for the stuff you need. And if they happen to be a little out of the mainstream (aren't we all?) the tools they have will be almost unrecognizable and stored in really strange places.
I've been playing with Eclipse, which is a program for software development, and it is a lot like that. Plenty of buttons right up front that do things that either I don't want or need, or else they do things that I do not even comprehend. All the "simple" stuff is buried in obscurity and requires the patience of Jobe (is that the right reference? Jobe?) to locate.
I have been using Microsoft's Visual Studio (Microsoft Visual C++ 6.0, actually) for the last ten years or so, and I have become used to the way things are done there. But it is Microsoft's, and I really don't want to be there, so Eclipse it is. And I am learning, slowly. I've been able to compile and debug some simple programs, and I have learned to deal with some of the quirks. What I really need though, is a bigger screen. Until I learn what all this stuff that is cluttering up the screen is, I am reluctant to delete any of it. Shoot, at this point I don't know if I could delete it.
I think I am going to try and go to a dual screen setup, which means I am going to have to do some major rearranging in my office. Another chore to add to my infinite list of things to get done.
Update January 2017: replaced missing picture.
Wednesday, February 18, 2009
On My Mind
There are a bunch of things on my mind these days. I put ten grand through my checking account last month: $10,000 in and a like amount out. Granted the bigger half of it was just transfers from college funds to colleges, but still, it was a stink load of money.
I got a very nice Canon digital camera for Christmas. It's a 10 mega-pixel Elf. Ten mega-pixels is huge. With the default memory card (32 mega-bytes) it would only hold 11 pictures. 10 mega-pixels give a picture that is over 3,000 pixels square. My fancy schmancy desktop LCD is only 1280 by 1024. It would take nine of these screens to show one 10 mega-pixel picture.
So I got to wondering how many pixels is enough? Display screens are going up in pixels and resolution, but how far will they go before the market is satisfied? How many pixels can the eye see? I know the center of your field of vision has a higher resolution than the periphery, but just how good is it?
I remember a printer ad I saw some years ago that had three pictures of a woman in a red, white and blue bathing suit. The text went something like this: At 100 dots per inch, you can see that the woman is wearing a bathing suit. At 300 dots per inch you can see that it is wet. At 1000 dots per inch you can see that it is painted on. The accompanying pictures, printed at the three resolutions, demonstrated this.
Anyway, I wanted to study up on this whole business of what we can see, and how we see it.
Then there is carbon dioxide. Never mind the climate change mongers, I was more interested in when does carbon dioxide become toxic. How long can you survive in a closed room? How much fresh air do you need? I started looking into this a few weeks ago but got sidetracked. One thing I did learn was that CO2 makes up less than one percent of the atmosphere. Oxygen is about 21%, and Nitrogen is about 78% and the rest is lost in a haze.
Of course, once I've got this all sorted out there is the whole climate change business. The one thing I have been looking for, and have not seen, is a good one page summary of where CO2 is coming from and where it's going. Some people like to point fingers at the automobile, but without a good overview of the situation I am not sure they should.
Then there's the Ackermann function. I came across it recently in Stu Savory's blog, and I've been playing with it on my computer. Using it to learn how to use Eclipse, a software development program for Linux. Also learning how to use a math library for doing arbitrary precision (numbers as big as you want to make them).
Don't forget Bertrand, a "functional programming language". I've been messing about with it also. It has an option to display its output graphically, but the program is so old, I am not sure the graphics package still exists, or if it does, if it works. There should be some way to adapt it to work with a web browser, but that is going to take some study, or luck.
The kids were home from college this weekend. It was nice to see them. Oldest son brought home a cold and a cough which he very kindly passed on to me. I still had a low grade sinus thing going on, so when you add this new thing, I was pretty miserable, so I have resumed a course of antibiotics that I started last month and then abandoned for reasons that are not really clear.
I'm also working on re-reading a book ("Drown All The Dogs" by Thomas Adcock) I read last month. After I put it down I couldn't remember what I read, so after a couple of weeks I picked it up and started over. So far, everything I read, I remember reading before, but it's not like it's boring, it's like reading it for the first time. Weird. Anyway, this time I am making a bunch of notes and I hope to have something intelligent to say when I finally do finish it.
Monday, February 16, 2009
Stack Size
Last week Stu put up a post that included a mention of Ackermann's function. It looked very simple, and being inclined to avoid doing any real work, I thought I would write up a little program to exercise it. The program was very simple, but the results were a little disappointing, if not unexpected. Stu warned that the results quickly exceed the bounds of the native 32-bit arithmetic of your typical Pentium processor.
My immediate problem with the program was not the arithmetic, but the stack. Ackermann's function is recursive, which means it calls itself, and as simple minded as it is, it has calls to itself nested within calls to itself. Kind of like wheels within wheels. So the problem is that it runs out of stack space and crashes which means it doesn't deliver any results at all.
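To get a feel for how deep the nesting goes, here is a small Python sketch that replaces the machine stack with an explicit list and records how tall it gets. It assumes the two-argument Ackermann-Péter definition, which may not be the exact variant from Stu's post, and the function name `ackermann_with_depth` is made up.

```python
def ackermann_with_depth(m, n):
    """Evaluate A(m, n) with an explicit stack; return (value, deepest stack seen)."""
    stack = [m]      # pending values of m still waiting for their n
    deepest = 1
    while stack:
        m = stack.pop()
        if m == 0:
            n += 1                   # A(0, n) = n + 1
        elif n == 0:
            stack.append(m - 1)      # A(m, 0) = A(m - 1, 1)
            n = 1
        else:
            stack.append(m - 1)      # outer call waits for the inner result
            stack.append(m)          # inner call A(m, n - 1) runs first
            n -= 1
        deepest = max(deepest, len(stack))
    return n, deepest


for m, n in [(1, 2), (2, 3), (3, 3)]:
    value, depth = ackermann_with_depth(m, n)
    print(f"A({m},{n}) = {value}, deepest stack = {depth} entries")
```

Because the list lives on the heap, this version is limited by available memory rather than by the stack size setting, though the growth is just as relentless.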
I have been using Microsoft's (boo! hiss!) Visual C++ (Version 6.0 last copyright 1998, a very good product) for quite a while so I whipped up the first version of the program there, and when it ran out of stack space I was able to find the command to increase it (/stack:number).
But stack space is the least of the problems with this program. The other problem is the ridiculously large numbers it produces. So now we are operating in the realm of fantasy, and Windows, as wonderful as it is, is not going to cut it. So off to Linux we go. Besides, I had installed the Eclipse software development program on my Linux box, and I needed to learn how to use it. This looks like a prime opportunity.
After a bunch of fiddling around with Eclipse and my program, I finally get it to compile and run. All is well. Now, how do you set the stack size for this program? Help is no help at all. A bunch of Googling turns up ulimit, a shell command you can run from a terminal window to set a bunch of different memory allocation limits. So I set ulimit to some gloriously large number like a billion and let my program run overnight. It did nothing. It did not run out of stack space, which is something, but it also failed to complete the computation of A(4,0). Bah.
So I'm playing around with ulimit this morning and I learn a few things. After the first time you set the stack size, you can only reduce it. You can enlarge it only the first time you run ulimit. ulimit works in units of 1KB, so the default setting of 8192 is equivalent to 8 megabytes. Anything above 4 million or so (2 to the 22nd power, which translates to 4 gigabytes, the most memory that can be addressed by a 32-bit processor) causes the stack size to be unlimited. 7 or below will keep my program from running. The shell (or CLI (Command Line Interpreter) if you prefer) immediately reports back killed.
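The limits that ulimit manipulates can also be inspected from inside a program. Below is a minimal Python sketch using the standard `resource` module (Unix only); the helper names `describe_kb` and `stack_limits` are invented. There are two values, a soft limit and a hard limit, and as far as I can tell bash's ulimit sets both when given neither `-S` nor `-H`, which would explain why the setting can only go down after the first change.

```python
import resource


def describe_kb(limit_bytes):
    """Format a limit the way `ulimit -s` reports it: in units of 1 KB."""
    if limit_bytes == resource.RLIM_INFINITY:
        return "unlimited"
    return str(limit_bytes // 1024)


def stack_limits():
    """Return this process's (soft, hard) stack limits as ulimit-style strings."""
    soft, hard = resource.getrlimit(resource.RLIMIT_STACK)
    return describe_kb(soft), describe_kb(hard)


if __name__ == "__main__":
    soft, hard = stack_limits()
    print("soft stack limit:", soft)
    print("hard stack limit:", hard)
```

Running `ulimit -S -s` and `ulimit -H -s` in the shell should show the same pair of values.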


