03-07-2005, 10:39 PM | #1 | ||
College Benchwarmer
Join Date: Oct 2000
Location: calgary, AB
|
C++ Help
Need a little help here. It has been a while since I have programmed in C++ and I could really use the expertise here.
Here is the problem: I have several files of around a million lines each. What I need to do is skip to a certain line and read in 135 lines of data, i.e. I want to skip to line 900000 and read in lines 900000 - 900134, then skip to line 950000 and read in again; you get the idea. I have written something using getline in a while loop with an if statement, but it is brutally inefficient. Does anyone here have some advice on what I should do? I have searched the internet for a while, but all the solutions that appear helpful are at pay-only sites. There had better not be a goto command or I'm gonna rip my hair out for not figuring it out. |
||
03-07-2005, 10:40 PM | #2 |
Grizzled Veteran
Join Date: Dec 2002
Location: Little Rock, AR
|
Procedural Programming = bad
__________________
Xbox 360 Gamer Tag: GoldenEagle014 |
03-07-2005, 10:48 PM | #3 | |
College Benchwarmer
Join Date: Oct 2000
Location: calgary, AB
|
Quote:
Do you mean that going for a certain line and hard coding it that way is bad? I know exactly which lines of the file the data will be in if that makes things easier. |
|
03-07-2005, 10:58 PM | #4 |
Grizzled Veteran
Join Date: Dec 2002
Location: Little Rock, AR
|
Well, with object-oriented programming, you would just call whatever classes and objects you wanted from your driver program. It would take you a while to rewrite everything, but you may have to do it.
03-07-2005, 11:07 PM | #5 |
College Starter
Join Date: Oct 2000
Location: Berkeley
|
I don't think what he's doing has anything to do with object oriented vs procedural. Unless I'm misunderstanding, his program needs to read in a ~million line text file and get some info that is always on lines n through n + 135.
I have limited experience with C++, but with VB you could read the whole file into a string and use split (with vbCrLf as the delimiter) to break it into an array from which you could then directly pull whatever lines you needed. Not sure how efficient that would be for very large blocks of text though. Last edited by Daimyo : 03-07-2005 at 11:10 PM. |
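For comparison, a rough C++ counterpart of the read-everything-then-index idea. This is only a sketch: `read_all_lines` and `slice_lines` are made-up names, and it assumes the whole file fits comfortably in memory, which may not hold for very large files. A `std::vector` plays the role of the VB array here.

```cpp
#include <algorithm>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Read every line of the file into memory. Simple, but the whole file
// is held in RAM at once.
std::vector<std::string> read_all_lines(const std::string& path) {
    std::ifstream in(path);
    std::vector<std::string> lines;
    std::string line;
    while (std::getline(in, line)) {
        lines.push_back(line);
    }
    return lines;
}

// Copy `count` lines starting at 1-based line number `first`.
// Returns fewer lines (possibly none) if the input is too short.
std::vector<std::string> slice_lines(const std::vector<std::string>& lines,
                                     std::size_t first, std::size_t count) {
    std::vector<std::string> out;
    if (first == 0 || first > lines.size()) {
        return out;
    }
    const std::size_t begin = first - 1;
    const std::size_t end = std::min(begin + count, lines.size());
    out.assign(lines.begin() + begin, lines.begin() + end);
    return out;
}
```

With these, something like `slice_lines(read_all_lines("data.txt"), 900000, 135)` would pull the block of interest, where `data.txt` is a placeholder file name.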
03-07-2005, 11:14 PM | #6 |
Grizzled Veteran
Join Date: Dec 2002
Location: Little Rock, AR
|
Ah, color me silly. I misread what you said. Thanks for pointing that out, Daimyo. I figured you were doing some sort of cleanup of an old program written a long time ago.
Give me a minute, I am sure I can come up with something.
03-07-2005, 11:20 PM | #7 | |
Grizzled Veteran
Join Date: Dec 2002
Location: Little Rock, AR
|
Quote:
Last edited by GoldenEagle : 03-07-2005 at 11:20 PM. |
|
03-07-2005, 11:35 PM | #8 |
General Manager
Join Date: Oct 2000
Location: The Satellite of Love
|
This is a problem with text files. There's no simple "seek line 900000 of 'thistextfile.txt'" type of function in C++. It sure would be simple to be able to type in textfile.seek(900000) and have it go right there. Any kind of function that sends you to a spot in a text file is going to be reading in each line to get you there anyway because simply, it has to. To my knowledge (and I've looked), there simply is no other way.
When it comes to text files, you just have to accept "brutally inefficient" or find a different way of storing data. EDIT: Unless each and every line consists of the same number of characters, that is. Last edited by sabotai : 03-07-2005 at 11:37 PM. |
03-07-2005, 11:35 PM | #9 |
College Benchwarmer
Join Date: Oct 2000
Location: calgary, AB
|
Unless I can write it in Excel, I don't have access to VB. I have Microsoft Developer Studio 97 for C++ and I have MATLAB. I have taken two programming courses, both mandatory engineering courses, which marked the first time I had ever programmed. Since I started working I have become pretty good with VB; however, I have managed to forget a lot of the dynamic programming I once learned. The short of this is, I ain't got no freaking clue what you're talking about. Is a vector like a dynamically allocated array? Also, I still don't see how to access the lines of the file that I want to find; would that mean that split would split the array at the end-of-line terminators? I don't know if this means anything, but the files are huge (95 MB < x < 160 MB).
Thanks for your attempts to help me so far but I'm going to need you to go somewhat more in depth. Last edited by nilodor : 03-07-2005 at 11:37 PM. |
03-07-2005, 11:42 PM | #10 | |
College Benchwarmer
Join Date: Oct 2000
Location: calgary, AB
|
Quote:
If spaces count as characters then yes, yes it does! |
|
03-07-2005, 11:50 PM | #11 | |
General Manager
Join Date: Oct 2000
Location: The Satellite of Love
|
Quote:
Spaces count. Well, in that case you should be able to determine at what position line 900000 starts by using some simple math. Basically, you need to determine how many characters are in each line, including the "invisible" ones, i.e. the one(s) that mark a new line (use a hex editor to figure that out quickly). Then take that number and multiply it by sizeof(char), and that will give you the data size of each line in the file. Take that number, multiply it by 900000 (or maybe 899999, since the starting position of the first line is 0), and that should give you the position in the file where the 900000th line begins. Then just send the file pointer there and that should be it. EDIT: Here's roughly how it would look using made-up numbers.... Code:
Something like that. Sorry if I made any obvious errors (read: Check my work!) but I've had about 5 or so hours of sleep in the last two days. I'm a little "punch drunk" right now. Last edited by sabotai : 03-07-2005 at 11:59 PM. |
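A minimal sketch of the offset arithmetic described in this post, assuming every line really is the same length. The function name `read_fixed_width_lines` and its parameters are made up for illustration. Two details worth knowing: `sizeof(char)` is always 1 in C++, so the line size in bytes is just the character count including the terminator, and since line 1 starts at offset 0, line N starts at (N - 1) times the line size.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Jump straight to 1-based line `first` in a file whose lines are all
// exactly `line_bytes` long (terminator included), then read up to
// `count` lines.
std::vector<std::string> read_fixed_width_lines(const std::string& path,
                                                std::size_t line_bytes,
                                                std::size_t first,
                                                std::size_t count) {
    std::vector<std::string> out;
    // Binary mode, so a "\r\n" terminator counts as two bytes and the
    // computed offsets match the bytes on disk.
    std::ifstream in(path, std::ios::binary);
    if (!in || first == 0) {
        return out;
    }

    // Line 1 starts at offset 0, so line N starts at (N - 1) * line_bytes.
    const std::streamoff offset = static_cast<std::streamoff>(first - 1) *
                                  static_cast<std::streamoff>(line_bytes);
    in.seekg(offset, std::ios::beg);

    std::string line;
    while (out.size() < count && std::getline(in, line)) {
        if (!line.empty() && line.back() == '\r') {
            line.pop_back();  // drop the CR left over from a CRLF terminator
        }
        out.push_back(line);
    }
    return out;
}
```

For example, if a hex editor showed 80 visible characters plus CR and LF per line, the call would be `read_fixed_width_lines(path, 82, 900000, 135)`; the 82 is a made-up number. If even one line has a different length, every offset after it is off, so it is worth checking that the first line read back looks right.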
|
03-08-2005, 05:24 AM | #12 |
SI Games
Join Date: Oct 2000
Location: Melbourne, FL
|
If it doesn't have to be part of an embedded C++ program, then take a quick look at 'awk'. It's a scripting language designed for text parsing; it's a godsend for such things and can make rather complicated operations extremely simple*.
*The first time I used it was for parsing symbol table dumps of a flight simulator and decompiling them into structure and variable information with offsets from a base memory address. I'd estimated that it would have taken me around a month to program and debug a C/C++ program to do the work; in awk it took me under a week. |
03-08-2005, 05:25 AM | #13 |
SI Games
Join Date: Oct 2000
Location: Melbourne, FL
|
PS> and that week included learning the basics of the language
|
03-08-2005, 09:42 AM | #14 |
College Starter
Join Date: Oct 2000
Location: Berkeley
|
Does C++ support regular expressions? I've used them in VBScript on 200 MB text files and it went pretty fast... you could use that if the data you're looking for fits a predetermined pattern. Then you don't need to worry about line numbers.
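On the regular expression question: standard C++ gained them with the `<regex>` header in C++11, so a compiler as old as Developer Studio 97 would need a third-party library instead. A minimal sketch with a modern compiler, where `matching_lines` is a made-up helper name and the pattern is whatever marks the lines of interest:

```cpp
#include <istream>
#include <regex>
#include <string>
#include <vector>

// Collect every line that contains a match for `pattern`.
// The stream is still read from start to finish.
std::vector<std::string> matching_lines(std::istream& in,
                                        const std::regex& pattern) {
    std::vector<std::string> hits;
    std::string line;
    while (std::getline(in, line)) {
        if (std::regex_search(line, pattern)) {
            hits.push_back(line);
        }
    }
    return hits;
}
```

A call might look like `matching_lines(file, std::regex("^OCT"))`, with `^OCT` standing in for a real pattern. This trades knowing the line numbers for knowing what the lines look like; it does not avoid reading the file.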
|
03-09-2005, 08:39 PM | #15 | |
Pro Starter
Join Date: Jul 2003
Location: South Bend, IN
|
Quote:
Blind adherence to OO = bad.
__________________
Hattrick - Brays Bayou FC (70854) / USA III.4 Hockey Arena - Houston Aeros / USA II.1 Thanks to my FOFC Hattrick supporters - Blackout, Brillig, kingfc22, RPI-fan, Rich1033, antbacker, One_to7, ur_land, KevinNU7, and TonyR (PM me if you support me and I've missed you) |
|
03-14-2005, 01:38 PM | #16 |
College Benchwarmer
Join Date: Oct 2000
Location: calgary, AB
|
So I figured out how to do this in VBA. If anyone is interested in looking at the code, here are the important parts. It is a pretty dirty way to do it, but what can you do?
pctdone = 0
With UserForm1
    .Label1.Caption = "Initializing"
    .FrameProgress.Caption = Format(pctdone, "0%")
    .LabelProgress.Width = pctdone * (.FrameProgress.Width - 10)
End With
' The DoEvents statement is responsible for the form updating
DoEvents
bob = "confy" ' File that contains the name of the files to open
file1 = ActiveWorkbook.Path & "\" & bob & ".txt"
Set fs = CreateObject("Scripting.FileSystemObject") ' object to interact with the files
Set filer = fs.opentextfile(file1) ' open conf.txt
piper = filer.readline ' read in the name of the files
filer.Close
Sheets("Displacement").Select
Set filer = fs.opentextfile(ActiveWorkbook.Path & "\" & piper & ".pipdis") ' Open Pipe Displacements
For i = 1 To 2584 ' Skip to OCT 1 entry
    filer.skipline
Next i
pctdone = 0.01
With UserForm1
    .Label1.Caption = "Finding Year 1 Pipe Displacements"
    .FrameProgress.Caption = Format(pctdone, "0%")
    .LabelProgress.Width = pctdone * (.FrameProgress.Width - 10)
End With
' The DoEvents statement is responsible for the form updating
DoEvents
For j = 1 To 136
    filer.Skip (47) ' Skip to the y-displacement column
    Cells([j], [8]).Value = filer.read(12) ' read the data in and paste it in a spreadsheet
    filer.skipline
Next j
|
03-14-2005, 02:42 PM | #17 | |
General Manager
Join Date: Oct 2000
Location: The Satellite of Love
|
Quote:
Hey, as long as it works. |
|
03-14-2005, 09:31 PM | #18 | |
College Starter
Join Date: Oct 2000
Location: Calgary
|
Quote:
I'll inevitably write 20 lines to his five to accomplish the same thing. At some point my design will have to become slicker but I guess it's just experience. I hate the way some of his functions bunch everything into one line; for instance, in a bool he might do something like: Code:
I like to return one variable and no doubt would have expanded the code with one or more if statements. Not cool enough I guess... Last edited by Karim : 03-14-2005 at 09:31 PM. |
|
03-14-2005, 10:33 PM | #19 |
Awaiting Further Instructions...
Join Date: Nov 2001
Location: Macungie, PA
|
Karim,
I've found that the number of lines is meaningless if it creates code that is uber efficient but hard for another person (or even yourself in a few months) to figure out. When I code I am very structured. E.g., my for loops are like:
for(x=1;x<10;x++){
    //stuff here
}
and are properly tabbed. Even simple programs can contain hundreds of lines of code and if they aren't easy to read and navigate, forget it. |
03-14-2005, 10:48 PM | #20 |
General Manager
Join Date: Oct 2000
Location: The Satellite of Love
|
Having all of those function calls on one line is not "uber efficient". The increase in efficiency, unless it's called a million times (literally), is pretty negligible. In my opinion, it's sloppy and very hard to read.
Sure, you can be the "cool guy" by pushing 5 function calls into an if statement, but in the end all you've done is save your processor a couple of assignments (completely negligible unless it's being called very frequently) and made it so everyone reading your code has to stop on that line and look it over for a good minute. Imagine if you're on a team and someone coded 100 lines or so in a project you're working on but did it like that. Your team would have to spend a night looking over and tracing his code just to figure out what it's doing. Even if he comments the hell out of it, that's a lot of reading to do. IMO, the best code is as self-explanatory as possible. One line, one function. (That's my motto.) And I go even farther than Bonegavel. I put that beginning { on the line below the statement:
for (x = 1; x < 10; x++)
{
    // stuff
}
(indented properly, and I don't like it bunched up. I space it out some) Last edited by sabotai : 03-14-2005 at 10:49 PM. |
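To make the two styles concrete, here is a made-up example; it is not the code being discussed in the thread, and every function name in it is hypothetical. Both versions return the same result for any input; the difference is only in how much a reader has to untangle on one line.

```cpp
#include <string>

// Hypothetical checks, invented purely to have something to combine.
bool has_txt_extension(const std::string& name) {
    const std::string ext = ".txt";
    return name.size() >= ext.size() &&
           name.compare(name.size() - ext.size(), ext.size(), ext) == 0;
}

bool is_short_enough(const std::string& name) {
    return name.size() <= 64;
}

// Dense style: every check folded into a single return expression.
bool is_valid_dense(const std::string& name) {
    return !name.empty() && is_short_enough(name) && has_txt_extension(name);
}

// Expanded style: one check per statement, one variable returned.
bool is_valid_expanded(const std::string& name) {
    bool valid = true;
    if (name.empty()) {
        valid = false;
    }
    if (!is_short_enough(name)) {
        valid = false;
    }
    if (!has_txt_extension(name)) {
        valid = false;
    }
    return valid;
}
```

With three cheap checks the dense form is arguably still readable; the complaint above is about what happens when the chain grows to five or more calls with side effects.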
03-14-2005, 11:54 PM | #21 |
College Prospect
Join Date: Oct 2000
Location: Baltimore, MD
|
I also prefer the { on the line following, and never put it on the first line.
I also always use braces with loops/if statements, even when not needed, i.e.
if (x != y)
{
    y++;
}
as opposed to
if (x != y)
    y++;
Just personal preference, I guess. |
03-15-2005, 04:30 PM | #22 | |
College Starter
Join Date: Oct 2000
Location: Calgary
|
Quote:
That's another thing he *doesn't* do. I like a lot of white space in my code and considering I can't trace in my head like he can, I need all the help I can get. Glad to see his code style isn't particularly favoured... |
|
03-15-2005, 04:46 PM | #23 |
General Manager
Join Date: Oct 2000
Location: The Satellite of Love
|
I do the same as Raven. I always include the braces, even if I don't need them. Helps a lot with readability.
|