Front Office Football Central  

Old 03-07-2005, 10:39 PM   #1
nilodor
College Benchwarmer
 
Join Date: Oct 2000
Location: calgary, AB
C++ Help

Need a little help here. It has been a while since I have programmed in C++, and I could really use the expertise here.

Here is the problem: I have several files of around a million lines each. What I need to do is skip to a certain line and read in 135 lines of data, i.e. I want to skip to line 900000 and read in lines 900000 - 900134, then skip to line 950000 and read in, you get the idea.

I have written something using getline in a while loop with an if statement, but it is brutally inefficient. Does anyone here have some advice on what I should do? I have searched the internet for a while, but all the solutions that appear helpful are at pay-only sites. There had better not be a goto command or I'm gonna rip my hair out for not figuring it out.

Old 03-07-2005, 10:40 PM   #2
GoldenEagle
Grizzled Veteran
 
Join Date: Dec 2002
Location: Little Rock, AR
Procedural Programming = bad
__________________
Xbox 360 Gamer Tag: GoldenEagle014
Old 03-07-2005, 10:48 PM   #3
nilodor
College Benchwarmer
 
Join Date: Oct 2000
Location: calgary, AB
Quote:
Originally Posted by GoldenEagle
Procedural Programming = bad

Do you mean that going for a certain line and hard coding it that way is bad? I know exactly which lines of the file the data will be in if that makes things easier.
Old 03-07-2005, 10:58 PM   #4
GoldenEagle
Grizzled Veteran
 
Join Date: Dec 2002
Location: Little Rock, AR
Well, with object-oriented programming, you would just call whichever classes and objects you wanted from your driver program. It would take you a while to rewrite everything, but you may have to do it.
Old 03-07-2005, 11:07 PM   #5
Daimyo
College Starter
 
Join Date: Oct 2000
Location: Berkeley
I don't think what he's doing has anything to do with object oriented vs procedural. Unless I'm misunderstanding, his program needs to read in a ~million line text file and get some info that is always on lines n through n + 135.

I have limited experience with C++, but with VB you could read the whole file into a string and use split (with vbCrLf as the delimiter) to break it into an array from which you could then pull directly whatever lines you needed. Not sure how efficient that would be for very large blocks of text though.

Last edited by Daimyo : 03-07-2005 at 11:10 PM.
Old 03-07-2005, 11:14 PM   #6
GoldenEagle
Grizzled Veteran
 
Join Date: Dec 2002
Location: Little Rock, AR
Ah, color me silly. I misread what you said. Thanks for pointing that out, Daimyo. I figured you were doing some sort of cleanup of an old program written a long time ago.

Give me a minute, I am sure I can come up with something.
Old 03-07-2005, 11:20 PM   #7
GoldenEagle
Grizzled Veteran
 
Join Date: Dec 2002
Location: Little Rock, AR
Quote:
Originally Posted by Daimyo
I don't think what he's doing has anything to do with object oriented vs procedural. Unless I'm misunderstanding, his program needs to read in a ~million line text file and get some info that is always on lines n through n + 135.

I have limited experience with C++, but with VB you could read the whole file into a string and use split (with vbCrLf as the delimiter) to break it into an array from which you could then pull directly whatever lines you needed. Not sure how efficient that would be for very large blocks of text though.
Working with his suggestion, you could store the text in a vector and then access the data from that, which would be more convenient than using an array. This way, you would not have to define the array size up front either.

Last edited by GoldenEagle : 03-07-2005 at 11:20 PM.
Old 03-07-2005, 11:35 PM   #8
sabotai
General Manager
 
Join Date: Oct 2000
Location: The Satellite of Love
This is a problem with text files. There's no simple "seek line 900000 of 'thistextfile.txt'" type of function in C++. It sure would be simple to be able to type in textfile.seek(900000) and have it go right there. Any kind of function that sends you to a spot in a text file is going to be reading in each line to get you there anyway because simply, it has to. To my knowledge (and I've looked), there simply is no other way.

When it comes to text files, you just have to accept "brutally inefficient" or find a different way of storing data.

EDIT: Unless each and every line consists of the same number of characters, that is.

Last edited by sabotai : 03-07-2005 at 11:37 PM.
Old 03-07-2005, 11:35 PM   #9
nilodor
College Benchwarmer
 
Join Date: Oct 2000
Location: calgary, AB
Unless I can write it in Excel, I don't have access to VB. I have Microsoft Developer Studio 97 for C++ and I have MATLAB. I have taken two programming courses, both mandatory engineering courses, which marked the first time I had ever programmed. Since I started working I have become pretty good with VB; however, I have managed to forget a lot of the dynamic programming I once learned. The short of this is, I ain't got no freaking clue what you're talking about. Is a vector like a dynamically allocated array? Also, I still don't see how to access the lines of the file that I want to find. Would split break the array at the end-of-line terminators? I don't know if this means anything, but the files are huge (95 MB < x < 160 MB).

Thanks for your attempts to help me so far but I'm going to need you to go somewhat more in depth.

Last edited by nilodor : 03-07-2005 at 11:37 PM.
Old 03-07-2005, 11:42 PM   #10
nilodor
College Benchwarmer
 
Join Date: Oct 2000
Location: calgary, AB
Quote:
Originally Posted by sabotai
EDIT: Unless each and every line consists of the same number of characters, that is.

If spaces count as characters then yes, yes it does!
Old 03-07-2005, 11:50 PM   #11
sabotai
General Manager
 
Join Date: Oct 2000
Location: The Satellite of Love
Quote:
Originally Posted by nilodor
If spaces count as characters then yes, yes it does!

Spaces count.

Well, in that case you should be able to determine the position where line 900000 starts by using a simple math function. Basically, you need to determine how many characters are in each line, including the "invisible" ones such as the one that marks a new line (use a hex editor to figure that out quickly). Then take that number and multiply it by sizeof(char) (which is always 1 in C++), and that gives you the data size of each line in the file. Take that number and multiply it by 899999 (not 900000, since the starting position of the first line is 0), and that gives you the position in the file where the 900000th line begins. Then just send the file pointer there and that should be it.

EDIT: Here's how it would kind of look using made-up numbers....

Code:
#include <fstream>

int main()
{
    // The number of characters in each line, including the "invisible" ones
    // (on Windows a line usually ends in two of them, CR and LF).
    const int NumChars = 40;

    // Declare our variables
    int iSizeOfLine;
    int iStartPosition;

    // Determine where the 900,000th line starts (the first line starts at position 0)
    iSizeOfLine = NumChars * sizeof(char);
    iStartPosition = iSizeOfLine * 899999;

    // Open the file in binary mode so positions are raw byte offsets,
    // then send the file pointer to the beginning of the 900,000th line
    std::ifstream inputfile("ifile.txt", std::ios::binary);
    inputfile.seekg(iStartPosition, std::ios::beg);

    return 0;
}

Something like that. Sorry if I made any obvious errors (read: Check my work!) but I've had about 5 or so hours of sleep in the last two days. I'm a little "punch drunk" right now.

Last edited by sabotai : 03-07-2005 at 11:59 PM.
Old 03-08-2005, 05:24 AM   #12
Marc Vaughan
SI Games
 
Join Date: Oct 2000
Location: Melbourne, FL
If it doesn't have to be part of an embedded C++ program then take a quick look at 'awk'. It's a scripting language designed for use in text parsing; it's a godsend for such things and can make rather complicated operations extremely simple*.

*First time I used it was for parsing symbol table dumps of a flight simulator and decompiling them into structure and variable information with offsets from a base memory address, I'd estimated that it'd have taken me around a month to program and debug a C/C++ program to do the work - in awk it took me under a week.
Old 03-08-2005, 05:25 AM   #13
Marc Vaughan
SI Games
 
Join Date: Oct 2000
Location: Melbourne, FL
PS> and that week included learning the basics of the language
Old 03-08-2005, 09:42 AM   #14
Daimyo
College Starter
 
Join Date: Oct 2000
Location: Berkeley
Does C++ support regular expressions? I've used them in VBScript on 200 MB text files and it went pretty fast... you could use that if the data you're looking for fits a predetermined pattern. Then you don't need to worry about line numbers.
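
For later readers: standard C++ gained regular expressions in C++11 through the <regex> header, which postdates the compiler mentioned earlier in the thread. A minimal sketch of pattern-based line selection follows; linesMatching is a made-up helper name, and any pattern passed to it is the caller's assumption about the data.

```cpp
#include <istream>
#include <regex>
#include <string>
#include <vector>

// Collect every line of `in` that contains a match for `pattern`.
// Hypothetical helper; requires a C++11 (or later) compiler.
std::vector<std::string> linesMatching(std::istream& in,
                                       const std::regex& pattern)
{
    std::vector<std::string> hits;
    std::string line;
    while (std::getline(in, line))
    {
        if (std::regex_search(line, pattern))
        {
            hits.push_back(line);
        }
    }
    return hits;
}
```

Note that this still reads the file from the start, so it removes the need to know line numbers but not the cost of scanning.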
Old 03-09-2005, 08:39 PM   #15
Mr. Wednesday
Pro Starter
 
Join Date: Jul 2003
Location: South Bend, IN
Quote:
Originally Posted by GoldenEagle
Procedural Programming = bad
Um, no. Right tool for the right job, and that sort of thing.

Blind adherence to OO = bad.
__________________
Hattrick - Brays Bayou FC (70854) / USA III.4
Hockey Arena - Houston Aeros / USA II.1

Thanks to my FOFC Hattrick supporters - Blackout, Brillig, kingfc22, RPI-fan, Rich1033, antbacker, One_to7, ur_land, KevinNU7, and TonyR (PM me if you support me and I've missed you)
Old 03-14-2005, 01:38 PM   #16
nilodor
College Benchwarmer
 
Join Date: Oct 2000
Location: calgary, AB
So I figured out how to do this in VBA. If anyone is interested in looking at the code, here are the important parts. It is a pretty dirty way to do it, but what can you do?


pctdone = 0
With UserForm1
    .Label1.Caption = "Initializing"
    .FrameProgress.Caption = Format(pctdone, "0%")
    .LabelProgress.Width = pctdone * (.FrameProgress.Width - 10)
End With
' The DoEvents statement is responsible for the form updating
DoEvents


bob = "confy" ' File that contains the name of the files to open
file1 = ActiveWorkbook.Path & "\" & bob & ".txt"
Set fs = CreateObject("Scripting.FileSystemObject") ' object to interact with the files
Set filer = fs.opentextfile(file1) ' open conf.txt
piper = filer.readline ' read in the name of the files


filer.Close

Sheets("Displacement").Select
Set filer = fs.opentextfile(ActiveWorkbook.Path & "\" & piper & ".pipdis") ' Open Pipe Displacements
For i = 1 To 2584 ' Skip to OCT 1 entry
    filer.skipline
Next i

pctdone = 0.01
With UserForm1
    .Label1.Caption = "Finding Year 1 Pipe Displacements"
    .FrameProgress.Caption = Format(pctdone, "0%")
    .LabelProgress.Width = pctdone * (.FrameProgress.Width - 10)
End With
' The DoEvents statement is responsible for the form updating
DoEvents


For j = 1 To 136
    filer.Skip (47) ' Skip to the y-displacement column
    Cells([j], [8]).Value = filer.read(12) ' read the data in and paste it in a spreadsheet
    filer.skipline
Next j
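
For anyone who wants the same column extraction in C++, here is a rough equivalent of the Skip(47) / Read(12) step, applied to a line that has already been read. The helper name fixedColumn is made up; the 47 and 12 come from the VBA above.

```cpp
#include <cstddef>
#include <string>

// Return `width` characters of `line` starting `skip` characters in,
// or an empty string if the line is too short to contain the column.
// Hypothetical helper for illustration.
std::string fixedColumn(const std::string& line,
                        std::size_t skip,
                        std::size_t width)
{
    if (line.size() < skip + width)
    {
        return std::string();
    }
    return line.substr(skip, width);
}
```

Calling fixedColumn(line, 47, 12) on each of the lines of interest would give the y-displacement text, which could then be converted to a number with something like std::stod if needed.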
Old 03-14-2005, 02:42 PM   #17
sabotai
General Manager
 
Join Date: Oct 2000
Location: The Satellite of Love
Quote:
Originally Posted by nilodor
So I figured out how to do this in VBA. If anyone is interested in looking at the code, here are the important parts. It is a pretty dirty way to do it, but what can you do?

Hey, as long as it works.
Old 03-14-2005, 09:31 PM   #18
Karim
College Starter
 
Join Date: Oct 2000
Location: Calgary
Quote:
Originally Posted by sabotai
Hey, as long as it works.
Unless you're a stickler for efficiency like my instructor...

I'll inevitably write 20 lines to his five to accomplish the same thing. At some point my design will have to become slicker but I guess it's just experience.

I hate the way some of his functions bunch everything into one line; for instance, in a bool he might do something like:

Code:
bool function(parameters)
{
    return !(first call to function || second call to function || third call to function);
}

I like to return one variable and no doubt would have expanded the code with one or more if statements. Not cool enough I guess...
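
To make the comparison concrete, here are the two styles side by side. The three check functions are invented stand-ins for the "calls to function" above; only the shape of the code matters.

```cpp
// Hypothetical checks standing in for the three function calls.
bool isEmpty(int n)    { return n == 0; }
bool isNegative(int n) { return n < 0; }
bool isTooLarge(int n) { return n > 100; }

// Compact style: everything in one return expression.
bool isValidCompact(int n)
{
    return !(isEmpty(n) || isNegative(n) || isTooLarge(n));
}

// Expanded style: one check per statement, one variable returned.
bool isValidExpanded(int n)
{
    bool valid = true;
    if (isEmpty(n))
    {
        valid = false;
    }
    else if (isNegative(n))
    {
        valid = false;
    }
    else if (isTooLarge(n))
    {
        valid = false;
    }
    return valid;
}
```

Both versions stop at the first check that returns true (the || operator short-circuits, and so does the else-if chain), so the results are the same and the difference is purely one of readability.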

Last edited by Karim : 03-14-2005 at 09:31 PM.
Old 03-14-2005, 10:33 PM   #19
Bonegavel
Awaiting Further Instructions...
 
Join Date: Nov 2001
Location: Macungie, PA
Karim,

I've found that the number of lines is meaningless if it creates code that is uber efficient but hard for another (or even yourself in a few months) to figure out.

When I code I am very structured. For example, my for loops are like:

for(x=1;x<10;x++){
    //stuff here
}

and are properly tabbed.

Even simple programs can contain hundreds of lines of code and if they aren't easy to read and navigate, forget it.
Old 03-14-2005, 10:48 PM   #20
sabotai
General Manager
 
Join Date: Oct 2000
Location: The Satellite of Love
Having all of those function calls on one line is not "uber efficient". The increase in efficiency, unless it's called a million times (literally), is pretty negligible. In my opinion, it's sloppy and very hard to read.

Sure, you can be the "cool guy" by pushing 5 function calls into an if statement, but in the end all you've done is save your processor a couple of assignments (completely negligible unless it's being called very frequently) and made it so everyone reading your code has to stop on that line and look it over for a good minute. Imagine if you're on a team and someone coded 100 lines or so in a project you're working on but did it like that. Your team would have to spend a night looking over and tracing his code just to figure out what it's doing. Even if he comments the hell out of it, that's a lot of reading to do.

IMO, the best code is as self-explanatory as possible. One line, one function. (That's my motto.)

And I go even farther than Bonegavel. I put that beginning { on the line below the statement

for (x = 1; x < 10; x++)
{
    // stuff
}

(indented properly, and I don't like it bunched up. I space it out some)

Last edited by sabotai : 03-14-2005 at 10:49 PM.
Old 03-14-2005, 11:54 PM   #21
Raven
College Prospect
 
Join Date: Oct 2000
Location: Baltimore, MD
I also prefer the { on the line following, and never put it on the first line.

I also always use brackets with loops/if statements, even when not needed.

i.e.

if (x != y)
{
    y++;
}

as opposed to

if (x != y)
    y++;

just personal preference, I guess
Old 03-15-2005, 04:30 PM   #22
Karim
College Starter
 
Join Date: Oct 2000
Location: Calgary
Quote:
Originally Posted by Raven
I also always use brackets with loops/if statements, even when not needed.

i.e.

if (x != y)
{
    y++;
}

as opposed to

if (x != y)
    y++;

just personal preference, I guess

That's another thing he *doesn't* do. I like a lot of white space in my code and considering I can't trace in my head like he can, I need all the help I can get.

Glad to see his code style isn't particularly favoured...
Old 03-15-2005, 04:46 PM   #23
sabotai
General Manager
 
Join Date: Oct 2000
Location: The Satellite of Love
I do the same as Raven. I always include the braces, even if I don't need them. Helps a lot with readability.