Tuesday, June 15, 2010

Working with Excel in .Net

There are two ways of reading and writing Excel. However, there are some differences between each of them. Sometime i found this helpful but sometime the other is better, so let me try to illustrate two ways in reading Excel file and their pros and cons.
  1. Use ADO.NET to Retrieve and Modify Records in an Excel Workbook
      • Pros:  
        • Easy to use with ADO.net objects
        • Easy in loading data into Dataset or Datable, Once we have a Datable, we can use Select method to filtering data.
        • Easy in querying data by using SQL Select Statement
        • Easy inserting or modifying data by using DataAdapter with SQL Insert and Update statement.
      • Cons:
        • Provide only basic features with basic purpose (Retrieve and Modify data only)
        • Not able to format Excel Cell or Range
        • There is a bug when writing or modifying data in Text type. Text value will be preceded by a single quote (').
        • In some cases, it cannot read the value of some Cells. It always retrieve Null value, but in fact, the value do exist in the Cells (It happens mostly in formatted Cells)
        • Not flexible regarding to Table Name. It's required to know Excel's SheetName as TableName to query from.
      •  Connection string:
        • Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\Book1.xls;Extended Properties="Excel 8.0;HDR=YES;"
          • Excel 8.0 work for Microsoft Excel 8.0 (97), 9.0 (2000), 10.0 (2002) and also 11.0 (2003)
        • Provider=Microsoft.ACE.OLEDB.12.0;Data Source=c:\myFolder\myOldExcelFile.xls;Extended Properties="Excel 12.0;HDR=YES";
          • Excel 97-2007 Xls files with ACE OLEDB 12.0, this connection string to use the Office 2007 OLEDB driver
          • Note: 
            The quota " in the string needs to be escaped using your language specific escape syntax.
            c#, c++   \"
            VB6, VBScript   ""
            xml (web.config etc)   "
            or maybe use a single quota '.

            "HDR=Yes;" indicates that the first row contains columnnames, not data. "HDR=No;" indicates the opposite.

            "IMEX=1;" tells the driver to always read "intermixed" (numbers, dates, strings etc) data columns as text. Note that this option might affect excel sheet write access negative.
      • Table Naming Conventions
        There are several ways you can reference a table (or range) in an Excel workbook:
        • Use the sheet name followed by a dollar sign (for example, [Sheet1$] or [My Worksheet$]). A workbook table that is referenced in this manner includes the whole used range of the worksheet.


          Select * from [Sheet1$] 
        • Use a range with a defined name (for example, [MyNamedRange]):


          Select * from [MyNamedRange] 
        • Use a range with a specific address (for example, [Sheet1$A1:B10]):


          Select * from [Sheet1$A1:B10]
      • Add and Update Records
        • Directly run a command to insert or update records: To do this, you can create an OLEDbCommand object on your connection and set its CommandText property to a valid command to insert records:
      • INSERT INTO [Sheet1$] (F1, F2) values ('111', 'ABC')
      • UPDATE [Sheet1$] SET F2 = 'XYZ' WHERE F1 = '111'
        • Make Changes In Dataset/Datable and Use OleDbDataAdapter:
      • Set OleDbDataAdapter's InsertCommand to: INSERT INTO [Sheet1$] (F1, F2) values (?, ?)
      • Set OleDbDataAdapter's UpdateCommand to: UPDATE [Sheet1$] SET F2 = ? WHERE F1 = ?
        • Export data from another data source into an Excel workbook
          •  Use OLEDbCommand object, set CommandText property to: INSERT INTO [Sheet1$] IN 'C:\Book1.xls' 'Excel 8.0;' SELECT * FROM MyTable" or SELECT * INTO [Excel 8.0;Database=C:\Book1.xls].[Sheet1] FROM [MyTable]  
      • Create Workbooks and Tables
        •  To create a table in an Excel workbook, run the CREATE TABLE command: 
          • CREATE Tjavascript:void(0)ABLE Sheet1 (F1 char(255), F2 char(255))
      • Sample Code
      • Private m_sConn1 As String = "Provider=Microsoft.Jet.OLEDB.4.0;" & _
                       "Data Source=C:\ExcelData1.xls;" & _
                       "Extended Properties=""Excel 8.0;HDR=YES"""
        
        Private m_sConn2 As String = "Provider=Microsoft.Jet.OLEDB.4.0;" & _
                       "Data Source=C:\ExcelData2.xls;" & _
                       "Extended Properties=""Excel 8.0;HDR=YES"""
        
        Private m_sNorthwind = _
              "C:\Program Files\Microsoft Office\Office10\Samples\Northwind.mdb"
        
        Private m_sAction As String
        
        Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load
              RadioButton1.Text = "Create_Workbook"
              RadioButton2.Text = "Retrieve_Records"
              RadioButton3.Text = "Add_Records"
              RadioButton4.Text = "Update_Records"
              RadioButton5.Text = "Update_Individual_Cells"
              RadioButton6.Text = "Use_External_Source"
              Button1.Text = "Go!"
        End Sub
        
        Private Sub RadioButtons_Click(ByVal sender As Object, ByVal e As System.EventArgs) _
              Handles RadioButton1.Click, RadioButton2.Click, RadioButton3.Click, _
              RadioButton4.Click, RadioButton5.Click, RadioButton6.Click
              m_sAction = sender.Text'Store the text for the selected radio button
        End Sub
        
        Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
              Try
                 ' Call the associated routine to add/update/modify the workbook.
                 Select Case m_sAction
                    Case "Create_Workbook" : Create_Workbook()
                    Case "Retrieve_Records" : Retrieve_Records()
                    Case "Add_Records" : Add_Records()
                    Case "Update_Records" : Update_Records()
                    Case "Update_Individual_Cells" : Update_Individual_Cells()
                    Case "Use_External_Source" : Use_External_Source()
                 End Select
        
              Catch ex As OleDbException
                 Dim er As OleDbError
                 For Each er In ex.Errors
                    MsgBox(er.Message)
                 Next
              Catch ex2 As System.InvalidOperationException
                 MsgBox(ex2.Message)
              End Try
        
        
        End Sub
        
        Public Sub Create_Workbook()
        
              ' If the workbooks already exist, prompt to delete.
              Dim answer As MsgBoxResult
              If Dir("C:\ExcelData1.xls") <> "" Or Dir("C:\ExcelData2.xls") <> "" Then
                  answer = MsgBox("Delete existing workbooks (C:\ExcelData1.xls and " & _
                           "C:\ExcelData2.xls)?", MsgBoxStyle.YesNo)
                  If answer = MsgBoxResult.Yes Then
                      If Dir("C:\ExcelData1.xls") <> "" Then Kill("C:\ExcelData1.xls")
                      If Dir("C:\ExcelData2.xls") <> "" Then Kill("C:\ExcelData2.xls")
                  Else
                      Exit Sub
                  End If
              End If
        
        '==========================================================================
              ' Create a workbook with a table named EmployeeData. The table has 3 
              ' fields: ID (char 255), Name (char 255) and Birthdate (date).  
        '==========================================================================
              Dim conn As New OleDbConnection()
              conn.ConnectionString = m_sConn1
              conn.Open()
              Dim cmd1 As New OleDbCommand()
              cmd1.Connection = conn
              cmd1.CommandText = "CREATE TABLE EmployeeData (Id char(255), Name char(255), BirthDate date)"
              cmd1.ExecuteNonQuery()
              cmd1.CommandText = "INSERT INTO EmployeeData (Id, Name, BirthDate) values ('AAA', 'Andrew', '12/4/1955')"
              cmd1.ExecuteNonQuery()
              conn.Close()
        
        '==========================================================================
              ' Create a workbook with a table named InventoryData. The table has 3 
              ' fields: Product (char 255), Qty (float) and Price (currency). 
        '==========================================================================
        
              conn.ConnectionString = m_sConn2
              conn.Open()
              Dim cmd2 As New OleDbCommand()
              cmd2.Connection = conn
              cmd2.CommandText = "CREATE TABLE InventoryData (Product char(255), Qty float, Price currency)"
              cmd2.ExecuteNonQuery()
              cmd2.CommandText = "INSERT INTO InventoryData (Product, Qty, Price) values ('Cola', 200, 1.35)"
              cmd2.ExecuteNonQuery()
              cmd2.CommandText = "INSERT INTO InventoryData (Product, Qty, Price) values ('Chips', 550, 0.89)"
              cmd2.ExecuteNonQuery()
              conn.Close()
        
              ' NOTE: You can ALTER and DROP tables in a similar fashion.
        
        End Sub
        
        Public Sub Retrieve_Records()
        
              '==========================================================
              'Use a DataReader to read data from the EmployeeData table.
              '==========================================================
        
              Dim conn1 As New System.Data.OleDb.OleDbConnection(m_sConn1)
              conn1.Open()
              Dim cmd1 As New System.Data.OleDb.OleDbCommand("Select * From [EmployeeData$]", conn1)
              Dim rdr As OleDbDataReader = cmd1.ExecuteReader
        
              Debug.WriteLine(vbCrLf & "EmployeeData:" & vbCrLf & "=============")
              Do While rdr.Read()
                 Debug.WriteLine(System.String.Format("{0,-10}{1, -15}{2}", _
                    rdr.GetString(0), rdr.GetString(1), _
                    rdr.GetDateTime(2).ToString("d")))
              Loop
              rdr.Close()
              conn1.Close()
        
              '========================================================
              'Use a DataSet to read data from the InventoryData table.
              '========================================================
              Dim conn2 As New OleDbConnection(m_sConn2)
              Dim da As New OleDbDataAdapter("Select * From [InventoryData$]", conn2)
              Dim ds As DataSet = New DataSet()
              da.Fill(ds)
              Debug.WriteLine(vbCrLf & "InventoryData:" & vbCrLf & "==============")
              Dim dr As DataRow
              For Each dr In ds.Tables(0).Rows'Show results in output window
                 Debug.WriteLine(System.String.Format("{0,-15}{1, -6}{2}", _
                    dr("Product"), dr("Qty"), dr("Price")))
              Next
              conn2.Close()
        
        End Sub
        
        Public Sub Add_Records()
        
        '==========================================================================
              ' Run an INSERT INTO command to add new records to the workbook. 
        '==========================================================================
              Dim conn1 As New System.Data.OleDb.OleDbConnection(m_sConn1)
              conn1.Open()
              Dim cmd As New System.Data.OleDb.OleDbCommand()
              cmd.Connection = conn1
              cmd.CommandText = "INSERT INTO [EmployeeData$] (ID, Name, BirthDate) values ('CCC', 'Charlie', '10/14/48')"
              cmd.ExecuteNonQuery()
              cmd.CommandText = "INSERT INTO [EmployeeData$] (ID, Name, BirthDate) values ('DDD', 'Deloris', '7/19/98')"
              cmd.ExecuteNonQuery()
              conn1.Close()
        
              '====================================================================
              'Use the InsertCommand object to add new records to the InventoryData
              'table.
              '====================================================================
              Dim conn2 As New OleDbConnection(m_sConn2)
              Dim da As New OleDbDataAdapter("Select * From [InventoryData$]", conn2)
              Dim ds As DataSet = New DataSet()
              da.Fill(ds, "MyExcelTable")
        
              ' Generate the InsertCommand and add the parameters for the command.
              da.InsertCommand = New OleDbCommand( _
                 "INSERT INTO [InventoryData$] (Product, Qty, Price) VALUES (?, ?, ?)", conn2)
              da.InsertCommand.Parameters.Add("@Product", OleDbType.VarChar, 255, "Product")
              da.InsertCommand.Parameters.Add("@Qty", OleDbType.Double).SourceColumn = "Qty"
              da.InsertCommand.Parameters.Add("@Price", OleDbType.Currency).SourceColumn = "Price"
        
              ' Add two new records to the dataset.
              Dim dr As DataRow
              dr = ds.Tables(0).NewRow
              dr("Product") = "Bread" : dr("Qty") = 390 : dr("Price") = 1.89 : ds.Tables(0).Rows.Add(dr)
              dr = ds.Tables(0).NewRow
              dr("Product") = "Milk" : dr("Qty") = 99 : dr("Price") = 2.59 : ds.Tables(0).Rows.Add(dr)
        
              ' Apply the dataset changes to the actual data source (the workbook).
              da.Update(ds, "MyExcelTable")
              conn2.Close()
        
        End Sub
        
        Public Sub Update_Records()
        
        '==========================================================================
              ' Run an UPDATE command to change a record in the EmployeeData
              ' table.
        '==========================================================================
              Dim conn1 As New System.Data.OleDb.OleDbConnection(m_sConn1)
              conn1.Open()
              Dim cmd As New System.Data.OleDb.OleDbCommand()
              cmd.Connection = conn1
              cmd.CommandText = "UPDATE [EmployeeData$] " & _
                            "SET NAME = 'Aaron', BirthDate = '5/4/1975' WHERE ID = 'AAA'"
              cmd.ExecuteNonQuery()
              conn1.Close()
        
              '====================================================================
              ' Use the UpdateCommand object to modify records in the InventoryData
              ' table.
              '====================================================================
              Dim conn2 As New OleDbConnection(m_sConn2)
              Dim da As New OleDbDataAdapter("Select * From [InventoryData$]", conn2)
              Dim ds As DataSet = New DataSet()
              da.Fill(ds, "MyInventoryTable")
        
              ' Generate the UpdateCommand and add the parameters for the command.
              da.UpdateCommand = New OleDbCommand( _
                 "UPDATE [InventoryData$] SET Qty = ?, Price=? WHERE Product = ?", conn2)
              da.UpdateCommand.Parameters.Add("@Qty", OleDbType.Numeric).SourceColumn = "Qty"
              da.UpdateCommand.Parameters.Add("@Price", OleDbType.Currency).SourceColumn = "Price"
              da.UpdateCommand.Parameters.Add("@Product", OleDbType.VarChar, 255, "Product")
        
              ' Update the first two records.
              ds.Tables(0).Rows(0)("Qty") = 1000
              ds.Tables(0).Rows(0)("Price") = 10.1
              ds.Tables(0).Rows(1)("Qty") = 2000
              ds.Tables(0).Rows(1)("Price") = 20.2
        
              ' Apply the dataset changes to the actual data source (the workbook).
              da.Update(ds, "MyInventoryTable")
              conn2.Close()
        
        End Sub
        
        Public Sub Update_Individual_Cells()
        
        '==========================================================================
              ' Update individual cells on the EmployeeData worksheet; 
              ' specifically, cells F3, G3, and I4 are modified.
        '==========================================================================
        
              ' NOTE: The connection string indicates that the table does *NOT* 
              ' have a header row.
              Dim conn As New System.Data.OleDb.OleDbConnection(m_sConn1.Replace("HDR=YES", "HDR=NO"))
              conn.Open()
              Dim cmd As New System.Data.OleDb.OleDbCommand()
              cmd.Connection = conn
              cmd.CommandText = "UPDATE [EmployeeData$F3:G3] SET F1 = 'Cell F3', F2 = 'Cell G3'"
              cmd.ExecuteNonQuery()
              cmd.CommandText = "UPDATE [EmployeeData$I4:I4] SET F1 = 'Cell I4'"
              cmd.ExecuteNonQuery()
              conn.Close()
        
        End Sub
        
        Public Sub Use_External_Source()
        
              ' Open a connection to the sample Northwind Access database.
              Dim conn As New System.Data.OleDb.OleDbConnection( _
                    "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" & m_sNorthwind & ";")
              conn.Open()
              Dim cmd As New System.Data.OleDb.OleDbCommand()
              cmd.Connection = conn
        
        '=======================================================================
              ' Run an INSERT..INTO command on the Northwind database to append 
              ' the records from a table/query to an existing table in the Excel 
              ' workbook.
        '=======================================================================
              cmd.CommandText = "INSERT INTO [EmployeeData$] IN 'C:\ExcelData1.xls' 'Excel 8.0;'" & _
                 "SELECT EmployeeID AS ID, FirstName AS Name, BirthDate FROM Employees"
              cmd.ExecuteNonQuery()
        
        '==========================================================================
              ' Run a SELECT..INTO command on the Northwind database to insert 
              ' all the records from a table/query into a new sheet in the Excel 
              ' workbook.
        '==========================================================================
              cmd.CommandText = "SELECT * INTO [Excel 8.0;Database=C:\ExcelData2.xls].[ProductSales]" & _
                              "FROM [Product Sales for 1997]"
              cmd.ExecuteNonQuery()
        
              conn.Close()
        
        End Sub 
         
      • Extra Source Code:
        • Get Excel SheetName: 
        • private String[] GetExcelSheetNames(string excelFile)
          {
            OleDbConnection objConn = null;
            System.Data.DataTable dt = null;
          
            try
            {
              // Connection String. Change the excel file to the file you
          
              // will search.
          
              String connString = "Provider=Microsoft.Jet.OLEDB.4.0;" + 
                  "Data Source=" + excelFile + ";Extended Properties=Excel 8.0;";
              // Create connection object by using the preceding connection string.
          
              objConn = new OleDbConnection(connString);
              // Open connection with the database.
          
              objConn.Open();
              // Get the data table containg the schema guid.
          
              dt = objConn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
           
              if(dt == null)
              {
                return null;
              }
          
              String[] excelSheets = new String[dt.Rows.Count];
              int i = 0;
          
              // Add the sheet name to the string array.
          
              foreach(DataRow row in dt.Rows)
              {
                excelSheets[i] = row["TABLE_NAME"].ToString();
                i++;
              }
          
              // Loop through all of the sheets if you want too...
          
              for(int j=0; j < excelSheets.Length; j++)
              {
                // Query each excel sheet.
          
              }
          
              return excelSheets;
            }
            catch(Exception ex)
            {
              return null;
            }
            finally
            {
              // Clean up.
          
              if(objConn != null)
              {
                objConn.Close();
                objConn.Dispose();
              }
              if(dt != null)
              {
                dt.Dispose();
              }
            }
          }
          References: 
          http://support.microsoft.com/kb/316934
          http://www.connectionstrings.com/excel 
  2. Using Excel Object Library

No comments: