C# Free Component to Generate PDF - Convert HTML to PDF

I have been trying different components and methods for generating PDFs dynamically with C# and ASP.NET. There are quite a few commercial components, with prices ranging from $250 to $1,000+ per license. These products do a great job, and some of them can generate very complex PDF documents. But I just needed something that would generate simple PDF documents without much formatting.

There are also quite a few open source projects that provide rudimentary support for PDF creation. Many are pretty limited in their feature sets. It seems Adobe has kept its dominant position by making sure the PDF format is as complex as possible. One component in particular worked quite well for my needs and is completely free. It's called Pdfizer. You can view the project site here:

http://sourceforge.net/projects/pdfizer

It was a little difficult to find an up-to-date code sample. There is a Code Project article from 2004 on how to use this component, which you can read here:

http://www.codeproject.com/KB/cs/pdfizer.aspx

Unfortunately, the code shown in that article is outdated and no longer compiles. After some experimenting, I was able to get things working. To use the code below, first download the latest build from sourceforge.net, or download the files here in case the sourceforge.net link is unavailable:

Pdfizer.zip

Generate PDF from HTML

First, add references in your project to the 3 DLLs that Pdfizer uses:

ICSharpCode.SharpZipLib.dll (a compression library; it appears to be a dependency of the bundled iTextSharp build rather than an HTML parser)

itextsharp.dll (used by Pdfizer to create the PDF document)

Pdfizer.dll (the main component, containing the HtmlToPdfConverter class that performs the conversion)

Now we can add some code to use this component. Here is the code to generate a PDF from some HTML specified:

// Set the path where you want to write the PDF.
string sPathToWritePdfTo = @"C:\new_pdf_name.pdf";

// Build some HTML text to write as a PDF. You could also
// read this HTML from a file or another source.
// NOTE: This component doesn't understand CSS or other
// newer-style HTML, so you will need to use deprecated
// HTML formatting such as the <font> tag to make it look correct.
System.Text.StringBuilder sbHtml = new System.Text.StringBuilder();
sbHtml.Append("<html>");
sbHtml.Append("<body>");
sbHtml.Append("<font size='14'>My Document Title Line</font>");
sbHtml.Append("<br />");
sbHtml.Append("This is my document text");
sbHtml.Append("</body>");
sbHtml.Append("</html>");

// Create the file stream that the PDF will be written to.
// FileMode.Create truncates an existing file; OpenOrCreate would leave
// stale bytes behind if the new PDF is shorter than the old one.
using (System.IO.Stream stream = new System.IO.FileStream(
    sPathToWritePdfTo, System.IO.FileMode.Create))
{
    // create a new instance of the Pdfizer converter
    Pdfizer.HtmlToPdfConverter htmlToPdf = new Pdfizer.HtmlToPdfConverter();
    // open the stream to write the PDF to
    htmlToPdf.Open(stream);
    // write the HTML to the component
    htmlToPdf.Run(sbHtml.ToString());
    // close the write operation and complete the PDF file
    htmlToPdf.Close();
}
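
A stack trace posted in the comments shows Run() handing the string to System.Xml.XmlDocument.LoadXml, which suggests that Pdfizer expects well-formed XML rather than loose HTML. If that holds for your build, a quick pre-check like the sketch below can report the parser's complaint before you open the output stream. The class and method names (PdfizerInput, FindXmlError) are made up for this example and are not part of Pdfizer.

```csharp
using System;
using System.Xml;

public static class PdfizerInput
{
    // Hypothetical helper: returns null when the markup loads as XML,
    // otherwise the parser's error message (including line and position).
    public static string FindXmlError(string html)
    {
        try
        {
            // Same parser call that appears in the stack trace from the comments.
            XmlDocument doc = new XmlDocument();
            doc.LoadXml(html);
            return null;
        }
        catch (XmlException ex)
        {
            return ex.Message;
        }
    }
}
```

An unclosed <br>, an unquoted attribute value, or an entity such as &nbsp; each make FindXmlError return a message instead of null, so you can log it and fix the markup before calling Run().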

This component also supports PDF Chapters. You could add a single line of code right before the Run() method to make the HTML specified a single chapter like this:

// open the stream to write the PDF to
htmlToPdf.Open(stream);
 
// add a chapter for this HTML
htmlToPdf.AddChapter("My Chapter Title 1");
 
// write the HTML to the component
htmlToPdf.Run(sbHtml.ToString());

Repeat the AddChapter() and Run() methods for each chapter you want to add and then Close() to commit it to the PDF.

Your PDF should look something like this:

[Image: sample PDF generated by Pdfizer]

Download PDF using ASP.NET

Once the PDF is created, you can dynamically stream it back to the client browser in ASP.NET on the fly as a file download using code like this:

// Clear the HTTP response so that nothing but the file bits is in the stream.
HttpContext.Current.Response.Clear();
// Add the HTTP header that tells the browser to treat this as a file download.
// friendlypdfname.pdf is the name of the PDF as you want it to appear to the user
// (regardless of what it is named in your file system).
HttpContext.Current.Response.AddHeader("content-disposition", string.Format("attachment; filename={0}", "friendlypdfname.pdf"));
// Tell the browser what type of file this is so it can associate a MIME type with it.
HttpContext.Current.Response.ContentType = "application/pdf";

// Pass the path that you wrote the file to on your file system as the parameter to WriteFile().
HttpContext.Current.Response.WriteFile(sPathToWritePdfTo);
// End the response and commit the file to the stream.
HttpContext.Current.Response.End();

You could put this code in a button's OnClick event handler, or anywhere else that should stream the new PDF down to the client browser.
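
One thing to watch if the same path is reused for every download: System.IO.FileMode.OpenOrCreate does not truncate an existing file, so a shorter PDF written over a longer one keeps the old trailing bytes, while FileMode.Create starts from an empty file. The sketch below demonstrates the difference with plain byte arrays; FileModeDemo and OverwriteAndMeasure are hypothetical names used only for this illustration.

```csharp
using System;
using System.IO;

public static class FileModeDemo
{
    // Writes `first` to the path, then writes `second` over it using the
    // given FileMode, and returns the final length of the file in bytes.
    public static long OverwriteAndMeasure(string path, FileMode mode, byte[] first, byte[] second)
    {
        File.WriteAllBytes(path, first);
        using (FileStream stream = new FileStream(path, mode))
        {
            stream.Write(second, 0, second.Length);
        }
        return new FileInfo(path).Length;
    }
}
```

Writing 10 bytes over a 100-byte file leaves a 100-byte file with OpenOrCreate and a 10-byte file with Create. A PDF with leftover bytes after its end may fail to open in some readers.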

Insertion of illegal Element: 32

Insertion of illegal Element: 32

I get the above error unless I remove the tag from the HTML. With the tag removed it works, but I need this tag to show some text in bold.

Thanks

Facing error

How can I solve this error?

The number of columns in PdfPTable constructor must be greater than zero
Below is my code:
Response.ContentType = "application/pdf";
Response.AddHeader("content-disposition", "attachment;filename=UserDetails.pdf");
Response.Cache.SetCacheability(HttpCacheability.NoCache);
StringWriter sw = new StringWriter();
HtmlTextWriter hw = new HtmlTextWriter(sw);

this.Page.RenderControl(hw);

StringReader sr = new StringReader(sw.ToString());
Document pdfDoc = new Document(PageSize.A2, 7f, 7f, 7f, 0f);
HTMLWorker htmlparser = new HTMLWorker(pdfDoc);
PdfWriter.GetInstance(pdfDoc, Response.OutputStream);
pdfDoc.Open();
htmlparser.Parse(sr);
pdfDoc.Close();
Response.Write(pdfDoc);
Response.End();

I want to convert my aspx page to PDF

How can I convert my aspx page to PDF? I can't find any useful code that helps with this.

Getting blank page

Hi everyone, your HTML content is probably missing some tags, or has tags that are mismatched or not closed properly. Check the HTML content once more.

Thanks,
Mouli

Error Source

Reminder: this DLL does not support "table" conversion.
See: http://www.codeproject.com/Articles/5872/Pdfizer-a-dumb-HTML-to-PDF-conv...

Get blank PDF

Hi,
I used the above code and a PDF is generated, but it's blank.
Please help me with that.

Your HTML content might be

Your HTML content might be missing some tags, or have tags that are mismatched or not closed properly. Check the HTML content once more.
Thanks,
Mouli

Doesn't work :(

Hi!
I put this code into my app to generate a PDF file, but it doesn't generate one :( Please tell me what I'm doing wrong.


StringBuilder sb = new StringBuilder();
/* html to parse */
sb.Append("");
sb.Append("");
sb.Append("");
sb.Append("");

sb.Append("Test");

sb.Append("");
sb.Append("");
/* end of html */

String strCertificateTemplatePath = HttpContext.Current.Server.MapPath(CertificateTemplatePath);
// In .NET format strings MM is the month and mm is the minutes.
fileName = String.Format(CertificateGenerateFilePattern, _user.FirstName, _user.LastName, DateTime.Now.ToString("yyyy-MM-dd HH-mm-ss-fff"));
String filePath = HttpContext.Current.Server.MapPath(CertificateGeneratePath + fileName);

//file stream for PDF
using (System.IO.Stream stream = new System.IO.FileStream(filePath, System.IO.FileMode.OpenOrCreate))
{
Pdfizer.HtmlToPdfConverter htmlToPdf = new Pdfizer.HtmlToPdfConverter();
htmlToPdf.Open(stream);
htmlToPdf.Run(sb.ToString());
htmlToPdf.Close();
}

P.S. Sorry for my bad English...

Excellent tutorial

Thanks a lot, dear.
It's an excellent tutorial.
Best wishes.

Thank you

This is a great tutorial. Thanks for uploading it.

Well-formed HTML but blank output

I created HTML with 3 tables, but the output is a blank PDF!

Nothing is displayed.

Error in parsing HTML.

Hi All,

When I try to parse my HTML I get an ArgumentNullException:

Value cannot be null.
Parameter name: el

Has anyone come across the same error?
If you know the solution, please help.

Below is my HTML code.

"

Mystifly Logo Mystifly Consulting (India) Private Limited
No 10/4, 6th Floor, Mitra Towers
Kasturba Road, Bangalore
Pin Code-560001, India
Ph: +91 (0)80 427 71000,Fax: +91 (0)80 427 71000
email: crm@mystifly.com | web: www.mystifly.com
Sale Invoice
Ms. Roshni Desai
BCD Travel (Mumbai)
2nd Floor,Pramukh Plaza,#Cardinal Gracious Road, Chakala,ANDHERI (EAST)
Mumbai,#,#
Ph:,#Email:ajay.bali@bcdtravel.in
Invoice Number: 13017
Invoice Date:7/29/2011
Mystifly Reference No:202621
Air Reservation
Passengers:
Mr. TEST TESTA ETicketNo:A2S54DF2AS2DF
Mstr. TESTB TESTB ETicketNo:35AS4DF354SAD
Mstr. TESTI TESTI ETicketNo:DF6G35D4FG4D3
Booking Status: Confirmed
Airline: Air India (AI)
No. Of Passengers :3
Start Date: 18JUN End Date:6JUL
Segment:1
Depart LHR London Heathrow Date: 18JUN Time :2130
Arrival DEL Indira Gandhi Intl Date: 18JUN Time : 1030
Flight No:AI112
Class:Economy
Segment:2
Depart DEL Indira Gandhi Intl Date: 18JUN Time :1255
Arrival HYD Hyderabad Arpt Date: 18JUN Time : 1510
Flight No:AI544
Class:Economy
Segment:3
Depart HYD Hyderabad Arpt Date: 6JUL Time :0640
Arrival DEL Indira Gandhi Intl Date: 6JUL Time : 0835
Flight No:AI559
Class:Economy
Segment:4
Depart DEL Indira Gandhi Intl Date: 6JUL Time :1410
Arrival LHR London Heathrow Date: 6JUL Time : 1900
Flight No:AI111
Class:Economy
Pricing Details
Price Per Person # (INR) Total # (INR)
Pax Type Base Airport Tax Service Tax Total Count Total
Adult 42,763.44 600.00 528.56 43,892.00 1 43,892.00
Child 493.90 600.00 6.10 1,100.00 1 1,100.00
Infant 493.90 600.00 6.10 1,100.00 1 1,100.00
Invoice Total #(INR)
Total Price
Base Price: 43,751.23
Service Tax Reg. No: AAGCM1234PSD001 Airport Tax: 1,800.00
Category of Service: Air Travel Agents Service Tax: 540.77
Total Price: 46,092.00
Account Details
Bank: CITI Bank For Mystifly Consulting (India) Private Limited
Account Name: Mystifly Consulting (India) Private Limited
Account Number: 0142896807
Branch Name: M.G.Road Branch
IFSC Code: CITI0000004 Authorised Signature
This is a computer generated Invoice. Signature is not required.

"

My custom style is not applied

Hi, I am using this code to write HTML to PDF.
I am using div elements that have a style attribute, but the style is not applied.

HTML not writing

Hi, I am using the following HTML:
sbHtml.Append(""); sbHtml.Append("");
sbHtml.Append("Google");
sbHtml.Append(""); sbHtml.Append("");

But the above code gives the following error on this line:
htmlToPdf.Run(sbHtml.ToString());

Object reference not set to an instance of an object.

What would the solution be?

Thanks

html file to pdf

Document document = new Document(PageSize.A4, 80, 50, 30, 65);
StringBuilder strData = new StringBuilder(string.Empty);
string strHTMLpath = Server.MapPath("MyHTML.html");
string strPDFpath = Server.MapPath("MyPDF.pdf");
StringWriter sw = new StringWriter();
sw.WriteLine(Environment.NewLine);
sw.WriteLine(Environment.NewLine);
sw.WriteLine(Environment.NewLine);
sw.WriteLine(Environment.NewLine);
HtmlTextWriter htw = new HtmlTextWriter(sw);

StreamWriter strWriter = new StreamWriter(strHTMLpath, false, Encoding.UTF8);
strWriter.Write("" + htw.InnerWriter.ToString() + "");
strWriter.Close();
strWriter.Dispose();
iTextSharp.text.html.simpleparser.StyleSheet styles = new iTextSharp.text.html.simpleparser.StyleSheet();
styles.LoadTagStyle("ol", "leading", "16,0");
PdfWriter.GetInstance(document, new FileStream(strPDFpath, FileMode.Create));
document.Add(new Header(iTextSharp.text.html.Markup.HTML_ATTR_STYLESHEET, "Style.css"));
document.Open();
//ArrayList objects;
styles.LoadTagStyle("li", "face", "garamond");
styles.LoadTagStyle("span", "size", "8px");
styles.LoadTagStyle("body", "font-family", "times new roman");
styles.LoadTagStyle("body", "font-size", "10px");
document.NewPage();
List objects = iTextSharp.text.html.simpleparser.HTMLWorker.ParseToList(new StreamReader(strHTMLpath, Encoding.Default), styles);

//objects = iTextSharp.text.html.simpleparser.HTMLWorker.ParseToList(new StreamReader(strHTMLpath, Encoding.Default), styles);
for (int k = 0; k < objects.Count; k++)
{
document.Add((IElement)objects[k]);
}
document.Close();
Response.Write(Server.MapPath("~/" + strPDFpath));
Response.ClearContent();
Response.ClearHeaders();
Response.AddHeader("Content-Disposition", "attachment; filename=" + strPDFpath);
Response.ContentType = "application/octet-stream";
Response.WriteFile(Server.MapPath("~/" + strPDFpath));
Response.Flush();
Response.Close();
if (File.Exists(Server.MapPath("~/" + strPDFpath)))
{
File.Delete(Server.MapPath("~/" + strPDFpath));
}

Error

I am trying to use this to convert HTML to PDF. Whenever I try, it throws an error message saying "CurrentChapter is not defined". Any idea?

Great contribution

Mad props for this contribution. Simple way of making simple PDF files from HTML. Thanks a lot.

iTextSharp AGPL license

iTextSharp v5 is licensed under the AGPL, so it is only free if you can comply with that license (in practice, non-commercial or open source use), and its support for HTML is partial.

Have a look at ABCpdf.NET if you need full HTML and CSS, live links and forms, etc. The standard edition can be used for commercial applications free of charge - look out for the special offer section.

http://www.websupergoo.com/abcpdf-1.htm

Thanks

Dear Friend

Thank you very much for posting such valuable code to export HTML to PDF.

by
Aldrin
Dubai

ercan

Very useful information. I was very pleased. Thanks

Some help required with this

I'm trying to include a html table with this. The table appears fine, but it comes with a wide margin on either side of the page, this is making the table too cramped. How can I make the table spread through the width of the pdf page? Please help!

KD

About the itextsharp version

Hi, this was a very helpful post! However, I already have a version of itextsharp.dll (4.0.2.0), and the one available with this component is version 1.0.4.0. I need to use the 4.0 version in my application, and I need to find a way to get this component to refer the 4.0 dll and not look for the 1.0.4 version. How do I do this? Any help would be welcome.

I get this error now - Could not load file or assembly 'itextsharp, Version=1.0.4.0, Culture=neutral, PublicKeyToken=null' or one of its dependencies. The located assembly's manifest definition does not match the assembly reference. (Exception from HRESULT: 0x80131040)

Thanks
KD

medyum

I am to a great extent impressed with the article I have just read interesting very good

Hi... I am passing an HTML file

Hi... I am passing an HTML file which contains pictures too. It's giving an error: "Data at the root level is invalid. Line 1, position 1." What does this mean?

It means your HTML is not

It means your HTML is not valid.

Page Break

How can I make a "Page Break"?
Thank you, Rajko

I am using following

I am using following code:
StringBuilder s = new StringBuilder();
//test aspx = new test();
StringWriter sw = new StringWriter(s);
HtmlTextWriter writer = new Html32TextWriter(sw);
base.Render(writer);

System.IO.Stream stream = new System.IO.FileStream(@"C:\new_pdf_name6.pdf", FileMode.OpenOrCreate);
Pdfizer.HtmlToPdfConverter htmlToPdf = new Pdfizer.HtmlToPdfConverter();
// open the stream to write the PDF to
htmlToPdf.Open(stream);
// write the HTML to the component
htmlToPdf.Run(s.ToString());
// close the write operation and complete the PDF file
htmlToPdf.Close();

This is giving following error on this statement:
htmlToPdf.Run(s.ToString());

Error is: The remote server returned an error: (503) Server Unavailable.

Please help

Can you add your source code?

Thank you for a quite nice html2pdf machine...
The thing is that I want to add new font styles with CP1250 encoding, and that seems impossible with this old itextsharp.dll library; besides, this one is not compatible with the newest version.
The original code doesn't support [table], [br], [p] and so on...
Michal

Image Support Updated

I think the reason your image support worked is because your image paths look to be local paths.  What most have tried and failed with is using images that are at URL paths on the web.  The component seems to have no functionality to download the images and put them in the PDF.  But it looks like if the images are a local path, it works.  I think that might reconcile the difference.

Image Support WORKS!

I'm not sure why others had a problem with it, however, I was able to use this with images without any problem. Below is the basic HTML that I generated (my images were 800X600).

<html>
<body>
<table height='800' width='600'>
<tr><td><img src='filepath1' /></td></tr>
<tr><td><img src='filepath2' /></td></tr>
<tr><td><img src='filepath3' /></td></tr>
</table>
</body>
</html>

This is a great little free method to get simple PDFs created. I highly recommend it!

Damn and blast that really sucks.....

Damn and blast that really sucks..... It's unfortunate that there isn't an open source HTML to PDF converter that can compete with the commercial solutions without the ridiculous licensing costs. Thanks for the quick reply nonetheless.

~3~

I hadn't tried images but it looks like

I hadn't tried images, but it looks like support is not good. It seems to just ignore your image URLs. Kind of sucks. When I put this together, I only needed text and formatting. It looks like that is as far as this component goes.

Great article, helped a ton, just one

Great article, helped a ton, just one question, what is the support for images like? I have a document rendered in valid XHTML but by adding images the pdf only contains a single table cell, it doesn't contain any of the HTML I had fed to it! Any ideas?

~3~

TABLEs not working

Anyone able to get this to work?

Has this been solved? I have the same problem

Hi, I have the same problem. I am dynamically building the HTML (I have checked and the output is correct), but nothing shows up in the PDF. Any ideas?

Simple and good article.

thank you for posting the code and necessary info.... its a great article.
There are so many paid products available. Some of the PDF converters are very close to the original Windows XP cost. Thanks again !!

Getting Error !!

Server Error in '/NewJcs' Application.
'0' is an unexpected token. The expected token is '"' or '''. Line 1, position 791.
Description: An unhandled exception occurred during the execution of the current web request. Please review the stack trace for more information about the error and where it originated in the code.

Exception Details: System.Xml.XmlException: '0' is an unexpected token. The expected token is '"' or '''. Line 1, position 791.

Source Error:

Line 492: htmlToPdf.Open(stream);
Line 493: // write the HTML to the component
Line 494: htmlToPdf.Run(sbHtml);
Line 495: // close the write operation and complete the PDF file
Line 496: htmlToPdf.Close();

Source File: d:\Shared\Project\Under My Documents\NewJcs\App_Code\jwstore.cs Line: 494

Stack Trace:

[XmlException: '0' is an unexpected token. The expected token is '"' or '''. Line 1, position 791.]
System.Xml.XmlTextReaderImpl.Throw(Exception e) +76
System.Xml.XmlTextReaderImpl.Throw(String res, String[] args) +88
System.Xml.XmlTextReaderImpl.ThrowUnexpectedToken(String expectedToken1, String expectedToken2) +104
System.Xml.XmlTextReaderImpl.ParseAttributes() +3978624
System.Xml.XmlTextReaderImpl.ParseElement() +343
System.Xml.XmlTextReaderImpl.ParseElementContent() +121
System.Xml.XmlTextReaderImpl.Read() +45
System.Xml.XmlLoader.LoadNode(Boolean skipOverWhitespace) +58
System.Xml.XmlLoader.LoadDocSequence(XmlDocument parentDoc) +20
System.Xml.XmlLoader.Load(XmlDocument doc, XmlReader reader, Boolean preserveWhitespace) +129
System.Xml.XmlDocument.Load(XmlReader reader) +108
System.Xml.XmlDocument.LoadXml(String xml) +113
Pdfizer.HtmlToPdfConverter.Run(String html) +53
jwstore.html2pdf(String sbHtml, String sPathToWritePdfTo) in d:\Shared\Project\Under My Documents\NewJcs\App_Code\jwstore.cs:494
Secure_confirmpayment.generate_text() in d:\Shared\Project\Under My Documents\NewJcs\secured\confirmpayment1.aspx.cs:551
Secure_confirmpayment.Page_Load(Object sender, EventArgs e) in d:\Shared\Project\Under My Documents\NewJcs\secured\confirmpayment1.aspx.cs:395
System.Web.Util.CalliHelper.EventArgFunctionCaller(IntPtr fp, Object o, Object t, EventArgs e) +14
System.Web.Util.CalliEventHandlerDelegateProxy.Callback(Object sender, EventArgs e) +35
System.Web.UI.Control.OnLoad(EventArgs e) +99
System.Web.UI.Control.LoadRecursive() +50
System.Web.UI.Page.ProcessRequestMain(Boolean includeStagesBeforeAsyncPoint, Boolean includeStagesAfterAsyncPoint) +627

Version Information: Microsoft .NET Framework Version:2.0.50727.3603; ASP.NET Version:2.0.50727.3082

Table not working properly

Hi, images are working properly, but tables are not: only the first cell is shown and the remaining td's are ignored.
Has anybody worked with tables?

regards
Srini

Need designed PDF

I have done all this and I am generating a custom calendar as a PDF.
I want to style the table.
How do I use a StyleSheet for this?

Please reply to guru.t8@gmail.com

Thanks in advance...