PDA

View Full Version : Collate Data from csv files to excel sheet



dhiraj.ch185
02-22-2012, 01:33 AM
Hi,

I am having data in 7 .csv sheets(Worksheet1.csv,worksheet2.csv,.....) with different number of columns and rows in a folder(c:\Working_Collate). I am working on a macro to collate all the data from all these sheets into a single Collated.xlsx sheet. The main issue i am not able to handle is the row and column count is dynamic for different files. Can someone help me out on this.

Note:Here, i am keeping all the files open while executing the macro. Can i also skip this step and just give filepaths of all the csv files to make this easy.

:confused:

Thanks in advance,

-Dhiraj

Excel Fox
02-22-2012, 06:32 PM
Dhiraj,

This can be done from what you've highlighted.

Can you explain the layout of these csv files? Will all the files have column headings?? And will the first column (A) be filled all the time? If you want to collate your data one below the other, is it OK to have data with different columns?

dhiraj.ch185
02-22-2012, 06:53 PM
All my csv files have column headers. Some of them have same column header names too. Also i don't want (http://www.pcreview.co.uk/forums/do-you-combine-multiple-csv-files-into-one-file-t1739041.html) the data to be collated one below the other. They should be one beside other. So if i open the excel sheet, all the data from all the csv files should be aligned one beside other.

My basic idea is, open all the csv files, activate one of the Csv worksheet, copy from starting cell (a1) to ending of sheet.now activate excel and paste them in sheet1. Now increase the offset to next line. Now activate other csv worksheet and do the same as the first and so on. :)
Note: As we know, the csv files will have only one sheet when opened. Also the sheet name will be same as csv file's name.

-Dhiraj

Admin
02-22-2012, 07:15 PM
Hi Dhiraj,

Try this. You don't need to open the CSVs.


Dim dic As Object
Dim Counter As Long
Sub kTest()

Dim r As Long
Dim c As Long
Dim n As Long
Dim j As Long
Dim Fldr As String
Dim Fname As String
Dim wbkActive As Workbook
Dim wbkSource As Workbook
Dim Dest As Range
Dim d, k()

Application.ScreenUpdating = False
Counter = 0
With Application.FileDialog(4)
.Title = "Select the CSV folder"
.AllowMultiSelect = False
.Show
If .SelectedItems.Count Then
Fldr = .SelectedItems(1)
Else
GoTo Xit
End If
End With


Set dic = CreateObject("scripting.dictionary")
Set wbkActive = ThisWorkbook
ReDim k(1 To 50000, 1 To 100)
Set Dest = wbkActive.Worksheets("Sheet1").Range("a1") '<<==== adjust to suit
Fname = Dir(Fldr & "\*.csv")
Do While Len(Fname)
Set wbkSource = Workbooks.Open(Fldr & "\" & Fname)
With wbkSource.Worksheets(1)
d = .Range("a1").CurrentRegion
UniqueHeaders Application.Index(d, 1, 0)
For r = 2 To UBound(d, 1) 'skips header
If Len(d(r, 1)) Then
n = n + 1
For c = 1 To UBound(d, 2)
If Len(Trim$(d(1, c))) Then
j = dic.Item(Trim$(d(1, c)))
k(n, j) = d(r, c)
End If
Next
End If
Next
Erase d
End With
wbkSource.Close 0
Set wbkSource = Nothing
Fname = Dir()
Loop

If n Then
Dest.Resize(, dic.Count) = dic.keys
Dest.Offset(1).Resize(n, dic.Count) = k
MsgBox "Done"
End If
Xit:
Application.ScreenUpdating = True

End Sub
Private Sub UniqueHeaders(ByRef DataHeader)

Dim i As Long
Dim j As Long

With Application
j = .ScreenUpdating
.ScreenUpdating = False
End With

For i = LBound(DataHeader) To UBound(DataHeader)
If Len(Trim$(DataHeader(i))) Then
If Not dic.exists(Trim$(DataHeader(i))) Then
Counter = Counter + 1
dic.Add Trim$(DataHeader(i)), Counter
End If
End If
Next

Application.ScreenUpdating = j

End Sub

dhiraj.ch185
02-22-2012, 07:33 PM
Thanks for the try. But i am facing an issue at

For i = LBound(DataHeader) To UBound(DataHeader)
If Len(Trim$(DataHeader(i))) Then
If Not dic.exists(Trim$(DataHeader(i))) Then
Counter = Counter + 1
dic.Add Trim$(DataHeader(i)), Counter
End If
End If
Next

Application.ScreenUpdating = j

Throwing an error as "Object Required".

Regards,
-Dhiraj

Admin
02-22-2012, 07:47 PM
Hi

Sorry, I missed two lines in the beginning. I have edited the above code.

Also please use code tags while posting codes :)

dhiraj.ch185
02-22-2012, 08:22 PM
Oops... I see "Subscript out of range error" once the current csv sheet which will open initially is closed. May be switching to one more sheet has some problem.

Admin
02-22-2012, 09:32 PM
Hi,

How many rows will be there once we collate all the data ? Replace the 50000 if there are more rows.

dhiraj.ch185
02-22-2012, 10:09 PM
Not too many rows. You can try if you wanted to. I am attaching the folder zipped.

How about a simple approach like:



ub ActivateASheet()
Dim SShtName As String
Dim Wbk As Workbook
Dim Ssheet As Worksheet

For Each Wbk In Application.Workbooks
On Error GoTo Done
Wbk.Activate
On Error Resume Next
SShtName = Sheet1.Range("A1")
Set Ssheet = Sheets(SShtName)
If Not Ssheet Is Nothing Then
Ssheet.Activate
Exit Sub
End If
Next Wbk

Done:
ThisWorkbook.Activate
End Sub



Can you tweak it to work for my scenario.

Thanks,
Dhiraj

Admin
02-23-2012, 07:44 AM
Hi,

It works fine here.

dhiraj.ch185
02-25-2012, 03:56 PM
I still see the the "Subscript out of range error". Can you recommend any settings i should do on my excel 2007 to get this working? Because, if it is working there and not in my machine, there might be difference somewhere. I am using windows7 and Excel2007.

Cheers!

Admin
02-25-2012, 04:36 PM
Hi

In which line the error occurs ?

dhiraj.ch185
02-27-2012, 12:08 AM
Hi,

Script stops at " k(n, j) = d(r, c) " and shows Subscript out of range error. As i see, the script is opening a .csv file randomly from the folder i specify, and the first cell(A1) of that worksheet is active. Also the first cell(A1) in excel sheet is active too. I can send you a recorded webex session if you would like to have a look.

Thanks,
Dhiraj

Admin
02-27-2012, 07:35 AM
Hi Dhiraj,

Are you working on the real data or on the sample CSVs you posted here ?

Anyway replace the For Next statement with the following.


For c = 1 To UBound(d, 2)
If Len(Trim$(d(1, c))) Then
j = dic.Item(Trim$(d(1, c)))
On Error Resume Next
k(n, j) = d(r, c)
If Err.Number <> 0 Then
MsgBox "Total Columns so far " & dic.Count & vbLf & _
"Counter :" & n & vbLf & _
"Current Col:" & j
Err.Clear: On Error GoTo 0
Exit Sub
End If
End If
Next

Let me know what the MsgBox says once it error out.

dhiraj.ch185
03-06-2012, 05:59 PM
Hi,

Sorry for late reply. I see the MsgBox as "Total Columns so far 13, Counter : 0, Current Col: 1".
Also i have tried the same script on my original data and also the data which is uploaded earlier. But no luck :(

Thanks,
-Dhiraj

Admin
03-06-2012, 07:03 PM
Hi,

It works fine here. I'm not sure why it fails to work at your end. Anyway please find attached and try again.

dhiraj.ch185
03-06-2012, 07:37 PM
Wow.. This worked well. I can see the data is collated. But, if we observe the collated file, all the columns of files got collated well but the data is collated as one below the other.

I guess, i will look look further into the script.

Thank you.
-Dhiraj