I want to generate a PDF by passing HTML contents to a function. I have made use of iTextSharp for this but it does not perform well when it encounters tables and the layout just gets messy. Is there a better way?
asked Feb 19, 2009 at 10:21 SandHurst

You can use GemBox.Document for this. Also here you can find a sample code for converting HTML file into a PDF file.
Commented Jan 25, 2016 at 7:51

Which version of iTextSharp do you use and could you share your html?
Commented Mar 15, 2016 at 17:35

Still no answer to my request for additional information. Please also add if you are using HTMLWorker or XMLWorker.
Commented Jun 10, 2016 at 15:28

What about .net core?
Commented Apr 12, 2019 at 9:10

Can we please reopen this one? Many new products provide this functionality, others are out of date. Without new answers, this can not be easily lined out. For 2022 I would recommend: github.com/hardkoded/puppeteer-sharp#generate-pdf-files Is well established, well maintained, simple to use, built on a solid basis etc.
Commented Jan 18, 2023 at 18:56

This will need Windows OS.
I tested this in an ASP.NET Core application and used net8.0-windows as the target framework so that this NuGet package could be used. I also installed the WebView runtime.
The creator also lists the Windows Desktop Runtime as a dependency.
    /// <summary>
    /// Return raw data as PDF
    /// </summary>
    [HttpGet("rawpdfex")]
    public async Task<IActionResult> RawPdf()
    {
        var file = Path.GetFullPath("./HtmlSampleFile-SelfContained.html");

        var pdf = new HtmlToPdfHost();
        var pdfResult = await pdf.PrintToPdfStreamAsync(file, new WebViewPrintSettings { PageRanges = "1-10" });

        if (pdfResult == null || !pdfResult.IsSuccess)
        {
            Response.StatusCode = 500;
            return new JsonResult(new
            {
                isError = true,
                // pdfResult may be null here, so use the null-conditional operator
                message = pdfResult?.Message
            });
        }

        return new FileStreamResult(pdfResult.ResultStream, "application/pdf");
    }
(After trying wkhtmltopdf and suggesting to avoid it)
HtmlRenderer.PdfSharp is a 100% managed C# solution that is easy to use, thread safe and, most importantly, FREE (New BSD License).
    public static Byte[] PdfSharpConvert(String html)
    {
        Byte[] res = null;
        using (MemoryStream ms = new MemoryStream())
        {
            var pdf = TheArtOfDev.HtmlRenderer.PdfSharp.PdfGenerator.GeneratePdf(html, PdfSharp.PageSize.A4);
            pdf.Save(ms);
            res = ms.ToArray();
        }
        return res;
    }
A very good alternative is a free version of iTextSharp
Up to version 4.1.6, iTextSharp was licensed under the LGPL licence, and versions up to 4.1.6 (there may also be forks) are available as packages and can be used freely. Of course, you can also use the continued 5+ paid version.
I tried to integrate wkhtmltopdf solutions into my project and ran into a bunch of hurdles.
I personally would avoid using wkhtmltopdf-based solutions in hosted enterprise applications for the following reasons.
--- PRE Edit Section ---
For anyone who wants to generate PDF from HTML in simpler applications / environments, I leave my old post as a suggestion.
or, especially for MVC web applications (but I think you may use it in any .NET application)
They both utilize the wkhtmltopdf binary for converting HTML to PDF. It uses the WebKit engine for rendering the pages, so it can also parse CSS style sheets.
They provide easy to use, seamless integration with C#.
Rotativa can also generate PDFs directly from any Razor View.
Additionally, for real world web applications they also manage thread safety etc.
answered Aug 11, 2015 at 14:35 Anestis Kivranoglou

Thank you for updating your post. I'm going to give PdfSharp a try. You saved me a lot of time.
Commented Aug 17, 2015 at 17:11

PdfSharp is good in terms of performance, but it didn't render floats properly for me. Luckily, I could change the markup to use good old tables, PdfSharp handles them well.
Commented Sep 14, 2015 at 19:12

We tried HtmlRenderer. It was really quick when not loading any CSS. But when we tried to apply CSS (Bootstrap plus some bespoke), the CSS parsing took a while (which we could probably mitigate), and rendering was completely different to the web page.
Commented Nov 10, 2015 at 22:42

BS. This creates an image of the HTML and adds the image into the pdf file. This is not a real PDF at all. Also, PDF is a vector graphics format - you can scroll near infinitely - of course except if the PDF consists of a raster graphic, which is what this library produces.
Commented Jun 6, 2017 at 9:12

@Anestis Kivranoglou i have used pdf sharp on my project. But for html design with css, it cannot render the html. Instead it is only creating a blank page
Commented Jun 21, 2019 at 11:59

Last Updated: October 2020
This is the list of options for HTML to PDF conversion in .NET that I have put together (some free, some paid)
If none of the options above help you, you can always search the NuGet packages:
https://www.nuget.org/packages?q=html+pdf
Have you tested any for performance? We are looking to improve current conversion times and are exploring other libraries for these performance benefits.
Commented May 14, 2020 at 11:10

Another wkhtmltopdf based solution that will even work on Azure web services is DinkToPdf fork: github.com/hakanl/DinkToPdf with nuget: nuget.org/packages/Haukcode.DinkToPdf
Commented Jul 5, 2020 at 8:39

DinkToPdf is free and works in .net core. nuget.org/packages/DinkToPdf
Commented Sep 12, 2020 at 17:46

@FritsJ there are plenty of options from the list ;-)
Commented Dec 23, 2020 at 15:13

update this list!! Also, check this solution: github.com/eKoopmans/html2pdf.js#getting-started It got me VERY far down the rabbithole, until .NET 6 broke it and I had to start again.
Commented May 27, 2022 at 21:00

I highly recommend NReco, seriously. It has a free and a paid version, and it is really worth it. It uses wkhtmltopdf in the background, but you just need one assembly. Fantastic.
    var htmlContent = String.Format("Hello world: {0}", DateTime.Now);
    var pdfBytes = (new NReco.PdfGenerator.HtmlToPdfConverter()).GeneratePdf(htmlContent);
Disclaimer: I'm not the developer, just a fan of the project :)
answered Apr 23, 2015 at 19:53 Kim Tranjan

Looks indeed pretty useful. Worth noting that as of today (05/10/15), it's the most downloaded .Net wrapper for wkhtmltopdf (as a Nuget package).
Commented Oct 5, 2015 at 16:14

Tried it, unfortunately I couldn't make it work on azure's web pages.
Commented Oct 6, 2015 at 19:32

This library works fine when I run it locally on my machine, but on the hosting server, I am seeing the following error randomly. Pdf gets generated sometimes but sometimes it throws the following error. "Error. An error occurred while processing your request. Cannot generate PDF: (exit code: 1)"
Commented Jan 26, 2016 at 16:43

wkhtmltopdf depends on GDI+, or x-server if you're running on Mono/Linux. So this is not useful for server environments.
Commented Jul 29, 2017 at 0:22

It's good and working as expected, but I see a bit of a quality issue in my pdf. Can we improve this?
Commented Aug 30, 2017 at 5:41

Most HTML to PDF converters rely on IE to do the HTML parsing and rendering. This can break when the user updates their IE. Here is one that does not rely on IE.
The code is something like this:
EO.Pdf.HtmlToPdf.ConvertHtml(htmlText, pdfFileName);
Like many other converters, you can pass text, file name, or Url. The result can be saved into a file or a stream.
answered Apr 12, 2011 at 13:06

it is not useful because you must purchase the library
Commented Mar 25, 2013 at 20:06

d1jhoni1b, how does this make it not useful? If it is a pay-for tool, then it might be said to be expensive, but not useless on that criteria alone.
Commented Sep 30, 2013 at 16:27

It's true EO.Pdf doesn't use IE. But it does seem to spawn 32 bit instances of a webkit browser in the background. Check your process list and you will see them as rundll32.exe instances pointing to the EO.PDF dll. So it still is a bit hacky in my opinion.
Commented Feb 27, 2015 at 22:24

It doesn't support media="print" which is really painful.
Commented Jul 2, 2015 at 11:05

Single developer licence for $650. That's costly.
Commented Aug 17, 2015 at 9:26

For all those looking for a working solution in .NET 5 and above, here you go.
Here are my working solutions.
    // Requires: using System; using System.IO; using System.Reflection;
    public static string HtmlToPdf(string outputFilenamePrefix, string[] urls,
        string[] options = null,
        string pdfHtmlToPdfExePath = @"C:\Program Files\wkhtmltopdf\bin\wkhtmltopdf.exe")
    {
        string urlsSeparatedBySpaces = string.Empty;
        try
        {
            // Determine inputs
            if ((urls == null) || (urls.Length == 0))
                throw new Exception("No input URLs provided for HtmlToPdf");
            else
                urlsSeparatedBySpaces = String.Join(" ", urls); // Concatenate URLs

            // Assemble destination PDF file name
            string outputFilename = outputFilenamePrefix + "_" + DateTime.Now.ToString("yyyy-MM-dd-hh-mm-ss-fff") + ".PDF";

            var p = new System.Diagnostics.Process()
            {
                StartInfo =
                {
                    FileName = pdfHtmlToPdfExePath,
                    Arguments = ((options == null) ? "" : string.Join(" ", options)) + " " + urlsSeparatedBySpaces + " " + outputFilename,
                    UseShellExecute = false, // needs to be false in order to redirect output
                    RedirectStandardOutput = true,
                    RedirectStandardError = true,
                    RedirectStandardInput = true, // redirect all 3, as it should be all 3 or none
                    WorkingDirectory = Path.Combine(Path.GetDirectoryName(Assembly.GetEntryAssembly().Location))
                }
            };

            p.Start();

            // Read the output here. stderr is read asynchronously while stdout is
            // drained, so that a full stderr pipe cannot block the child process.
            var errorTask = p.StandardError.ReadToEndAsync();
            var output = p.StandardOutput.ReadToEnd();
            var errorOutput = errorTask.GetAwaiter().GetResult();

            // ...then wait n milliseconds for exit (as after exit, it can't read the output)
            p.WaitForExit(60000);

            // Read the exit code, close process
            int returnCode = p.ExitCode;
            p.Close();

            // If 0 or 2, it worked, so return the path of the PDF
            if ((returnCode == 0) || (returnCode == 2))
                return outputFilename;
            else
                throw new Exception(errorOutput);
        }
        catch (Exception exc)
        {
            throw new Exception("Problem generating PDF from HTML, URLs: " + urlsSeparatedBySpaces + ", outputFilename: " + outputFilenamePrefix, exc);
        }
    }
Drawbacks of this approach:
    var p = new System.Diagnostics.Process()
    {
        StartInfo =
        {
            FileName = "C:/Program Files (x86)/Google/Chrome/Application/chrome.exe",
            Arguments = @"/C --headless --disable-gpu --run-all-compositor-stages-before-draw --print-to-pdf-no-header --print-to-pdf=""C:/Users/Abdul Rahman/Desktop/test.pdf"" ""C:/Users/Abdul Rahman/Desktop/grid.html""",
        }
    };

    p.Start();

    // ...then wait n milliseconds for exit
    p.WaitForExit(60000);

    // Read the exit code, close process
    int returnCode = p.ExitCode;
    p.Close();
@"/C --headless --disable-gpu --run-all-compositor-stages-before-draw --print-to-pdf-no-header --print-to-pdf=""C:/Users/Abdul Rahman/Desktop/test.pdf"" ""https://www.google.com""",
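Both argument strings above differ only in the output path and the input (a local file or a URL), and both need quoting because the paths contain spaces. A minimal sketch of a helper that assembles such a string is shown here. `BuildChromePdfArguments` is a hypothetical name, not part of the original answer. It uses only the flags that appear in the snippets above and leaves out the leading `/C`, on the assumption that it is only meaningful to cmd.exe.

```csharp
using System;

// Hypothetical helper: builds the headless Chrome argument string and
// wraps both paths in double quotes so that spaces do not split them.
string BuildChromePdfArguments(string outputPdfPath, string input)
{
    return string.Join(" ",
        "--headless",
        "--disable-gpu",
        "--run-all-compositor-stages-before-draw",
        "--print-to-pdf-no-header",
        "--print-to-pdf=\"" + outputPdfPath + "\"",
        "\"" + input + "\"");
}

// The input can be a local HTML file or a URL.
Console.WriteLine(BuildChromePdfArguments("C:/temp/test.pdf", "https://www.google.com"));
```

The resulting string can be assigned to `StartInfo.Arguments` in place of the hard-coded literal.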
Drawbacks of this approach:
    public async Task<byte[]> ConvertHtmlToPdf(string html)
    {
        var directory = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.CommonDocuments), "ApplicationName");
        Directory.CreateDirectory(directory);
        var filePath = Path.Combine(directory, $".html");
        await File.WriteAllTextAsync(filePath, html);

        var driverOptions = new ChromeOptions();
        // In headless mode, PDF writing is enabled by default (tested with driver major version 85)
        driverOptions.AddArgument("headless");
        using var driver = new ChromeDriver(driverOptions);
        driver.Navigate().GoToUrl(filePath);

        // Output a PDF of the first page in A4 size at 90% scale
        var printOptions = new Dictionary<string, object>
        {
            { "paperWidth", 210 / 25.4 },
            { "paperHeight", 297 / 25.4 },
            { "scale", 0.9 },
            { "pageRanges", "1" }
        };
        var printOutput = driver.ExecuteChromeCommandWithResult("Page.printToPDF", printOptions) as Dictionary<string, object>;
        var pdf = Convert.FromBase64String(printOutput["data"] as string);

        File.Delete(filePath);

        return pdf;
    }
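The `210 / 25.4` and `297 / 25.4` expressions are there because `Page.printToPDF` takes paper dimensions in inches, while A4 is defined in millimetres (210 x 297). A small sketch of that conversion follows. `MmToInches` and `BuildPrintOptions` are hypothetical helper names used for illustration only.

```csharp
using System;
using System.Collections.Generic;

// 1 inch is exactly 25.4 mm.
double MmToInches(double mm) => mm / 25.4;

// Hypothetical helper: builds the parameter dictionary for Page.printToPDF
// from a paper size given in millimetres.
Dictionary<string, object> BuildPrintOptions(double widthMm, double heightMm, double scale, string pageRanges)
{
    return new Dictionary<string, object>
    {
        { "paperWidth", MmToInches(widthMm) },
        { "paperHeight", MmToInches(heightMm) },
        { "scale", scale },
        { "pageRanges", pageRanges }
    };
}

// A4, first page only, at 90% scale (the same values as in the answer).
var a4 = BuildPrintOptions(210, 297, 0.9, "1");
var widthInches = (double)a4["paperWidth"];
var heightInches = (double)a4["paperHeight"];
Console.WriteLine(widthInches + " x " + heightInches + " inches");
```

Other paper sizes can be passed the same way, for example 215.9 x 279.4 mm for US Letter.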
Advantage of this method:
Drawbacks of this approach:
The above drawbacks can be overcome if we run the app in Docker. All we need to do is install Chrome when building the app image with the Dockerfile.
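With this approach, please make sure to add the following to the project's .csproj file:

    <PropertyGroup>
      <TargetFramework>net5.0</TargetFramework>
      <LangVersion>latest</LangVersion>
      <Nullable>enable</Nullable>
      <PublishChromeDriver>true</PublishChromeDriver>
    </PropertyGroup>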
This will publish the chrome driver when publishing the project.
Here is the link to my working project repo - HtmlToPdf
If users access your app from a browser, you can rely on JavaScript: call window.print() and supply the necessary print media CSS to generate the PDF from the browser. An example is generating an invoice from the browser in an inventory app.
Advantage of this method:
Drawbacks of this approach:
I arrived at the above answer after spending almost 2 days on the available options, and finally implemented the Selenium based solution, which is working. Hope this helps you and saves you time.