Capturing the Website and Sending it as a PDF via Email at Certain Time Intervals
Hi,
Today we will talk about capturing a web page, converting it to PDF, and emailing it to target users at certain time intervals.
First, we will create an ASP.NET Core Web API application in Visual Studio 2022. Don’t forget to select .NET 8.0 as the framework version.
Let’s Figure Out How To Capture a Website
You can use a couple of tools for capturing a website. I prefer the Community Edition of the “SelectPdf” tool. It is free for PDFs of up to 5 pages; for more than 5 pages you have to buy a developer license.
- “SelectPdf.HtmlToPdf converter”: This is the HTML-to-PDF converter.
- “MaxPageLoadTime”: We will wait up to 120 seconds (2 minutes) for the URL to respond.
- “PdfPageSize”: We will set the page size of the converted PDF to A4.
- “LoginOptions.DelayAfter = 5000”: Just in case, we will wait an extra 5 seconds (5000 ms) for the page to load.
- “SelectPdf.PdfDocument doc = converter.ConvertUrl(url)”: We will convert the web page to PDF.
- “Guid.NewGuid()”: We will give the converted PDF a unique name.
- “doc.Save($"C:\\TestPdf\\{guid}.pdf")”: We will save the converted PDF to a specific folder under that unique name.
ReportController.cs/GetPdfFromUrl():
[ApiExplorerSettings(IgnoreApi = true)]
[HttpGet(Name = "GetPdfFromUrl")]
public string GetPdfFromUrl(string url)
{
SelectPdf.HtmlToPdf converter = new SelectPdf.HtmlToPdf();
converter.Options.MaxPageLoadTime = 120;
converter.Options.PdfPageSize = SelectPdf.PdfPageSize.A4;
converter.Options.JavaScriptEnabled = true;
converter.Options.LoginOptions.DelayAfter = 5000; // 5 seconds wait for page loading..
SelectPdf.PdfDocument doc = converter.ConvertUrl(url);
string guid = Guid.NewGuid().ToString();
doc.Save($"C:\\TestPdf\\{guid}.pdf");
doc.Close();
return $"C:\\TestPdf\\{guid}.pdf";
}
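The method above writes the C:\\TestPdf path twice and assumes the folder already exists. A minimal sketch of one way to build the path once, assuming a hypothetical helper named PdfPathSketch (it is not part of SelectPdf or of the original project):

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration only.
public static class PdfPathSketch
{
    // Returns a unique .pdf path inside the given folder, creating the folder if needed.
    public static string NewPdfPath(string folder)
    {
        Directory.CreateDirectory(folder);          // no-op when the folder already exists
        string fileName = $"{Guid.NewGuid()}.pdf";  // unique name, as in the article
        return Path.Combine(folder, fileName);
    }
}
```

With such a helper, GetPdfFromUrl() could keep the result in a single variable, pass it to doc.Save(), and return the same variable.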
“I say something, and then it usually happens. Maybe not on schedule, but it usually happens.” — Elon Musk
Now Let’s Talk About How To Send This PDF to Target Users By Email
We will use the built-in .NET namespaces “System.Net.Mail” and “System.Net” for sending email.
- “string filePath = GetPdfFromUrl(“web page URL”)”: We will capture the web page from the URL, save it to the specific path, and return that path.
- “var client = new SmtpClient(“SMTP host”, port)”: Initializes a new instance of the SmtpClient class that sends email by using the specified SMTP server and port.
- “using (MailMessage mail = new MailMessage())”: Represents an email message that can be sent using the SmtpClient class.
- mail.Attachments.Add(): You can attach any file to the mail. In this scenario, we will attach the web page’s PDF.
- mail.Subject: Subject of the mail.
- mail.Body: You can write whatever you want in the mail body.
- mail.To.Add(“”): You can add the target users’ email addresses with the “To.Add()” method. Several addresses can be passed in one string, separated by commas.
- mail.From: The sender address that recipients will see in the “From” field.
- client.Send(mail): We will send the email with the “client.Send()” method.
- System.IO.File.Delete(filePath): Don’t forget to delete the PDF file after sending the email. We do not want it to take up space on the storage.
ReportController.cs/Get():
[HttpGet(Name = "GetReport")]
public bool Get()
{
try
{
string filePath = GetPdfFromUrl("https://www.borakasmer.com/hakkimda/");
var client = new SmtpClient("smtp client", 587)
{
Credentials = new NetworkCredential("your mail", "your password"),
};
using (MailMessage mail = new MailMessage())
{
mail.Attachments.Add(new Attachment(filePath));
mail.Subject = "About Bora Kasmer ®borakasmer.com";
mail.Body = "Who is Bora Kasmer ? And what is he doing ? ®borakasmer.com";
mail.To.Add("bora.kasmer78@gmail.com, hrndlpyrz@gmail.com");
mail.From = new MailAddress("bora@borakasmer.com");
client.Send(mail);
}
//file delete from path
System.IO.File.Delete(filePath);
}
catch (Exception) // any failure (capture, SMTP, file delete) is reported as false
{
return false;
}
return true;
}
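In the method above, the PDF is deleted only when sending succeeds; if client.Send() throws, the file stays on disk. The sketch below separates building the message from sending it, so the caller can dispose the message and delete the file in a finally block. ReportMailSketch and the example.com addresses are placeholders, not part of the original project.

```csharp
using System;
using System.Net.Mail;

// Hypothetical helper for illustration only.
public static class ReportMailSketch
{
    // Builds the message; sending is left to an SmtpClient configured elsewhere.
    public static MailMessage Build(string attachmentPath, string recipients)
    {
        var mail = new MailMessage
        {
            From = new MailAddress("reports@example.com"), // placeholder sender
            Subject = "Weekly web report",
            Body = "The captured page is attached as a PDF."
        };
        mail.To.Add(recipients);                              // comma-separated list of addresses
        mail.Attachments.Add(new Attachment(attachmentPath)); // opens the file until the message is disposed
        return mail;
    }
}
```

A caller could wrap the result in a using block, call client.Send(mail) inside it, and put System.IO.File.Delete(filePath) in a finally block so the file is removed whether or not the send succeeds.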
You can see the email below, after sending it.
“There cannot be a crisis next week. My schedule is already full.” — Henry Kissinger
Now Let’s Repeat This Process Weekly
We will use Hangfire for processing background jobs. We will add the Hangfire packages from NuGet; the code below relies on Hangfire itself, its SQL Server storage, and Hangfire.Dashboard.Basic.Authentication for the dashboard login.
1 First, I created a Service folder, then the “ICoreService” interface, and finally the “CoreService” class that implements this interface. I moved the ReportController methods above into this class.
Service/ICoreService.cs:
namespace GetPageandSendEmail.Service
{
public interface ICoreService
{
public string GetPdfFromUrl(string url);
public bool SendMailWebReport(string url);
}
}
Service/CoreService.cs:
using System.Net.Mail;
using System.Net;
namespace GetPageandSendEmail.Service
{
public class CoreService : ICoreService
{
public string GetPdfFromUrl(string url)
{
SelectPdf.HtmlToPdf converter = new SelectPdf.HtmlToPdf();
converter.Options.MaxPageLoadTime = 120;
converter.Options.PdfPageSize = SelectPdf.PdfPageSize.A4;
converter.Options.JavaScriptEnabled = true;
converter.Options.LoginOptions.DelayAfter = 5000; // 5 seconds wait for page loading..
SelectPdf.PdfDocument doc = converter.ConvertUrl(url);
string guid = Guid.NewGuid().ToString();
doc.Save($"C:\\TestPdf\\{guid}.pdf");
doc.Close();
return $"C:\\TestPdf\\{guid}.pdf";
}
public bool SendMailWebReport(string url)
{
try
{
string filePath = GetPdfFromUrl(url);
var client = new SmtpClient("SMTP client", 587)
{
Credentials = new NetworkCredential("username", "password"),
};
//add pdf attachment filepath to client
using (MailMessage mail = new MailMessage())
{
mail.Attachments.Add(new Attachment(filePath));
mail.Subject = "About Bora Kasmer ®borakasmer.com";
mail.Body = "Who is Bora Kasmer ? And what is he doing ? ®borakasmer.com";
mail.To.Add("bora.kasmer78@gmail.com, hrndlpyrz@gmail.com");
mail.From = new MailAddress("bora@borakasmer.com");
client.Send(mail);
}
//file delete from path
System.IO.File.Delete(filePath);
}
catch (Exception) // any failure (capture, SMTP, file delete) is reported as false
{
return false;
}
return true;
}
}
}
2 Second, I have to define how often the SendMailWebReport() method runs.
- “RecurringJob.RemoveIfExists()”: If a job with this ID already exists, we remove it.
- “RecurringJob.AddOrUpdate<ICoreService>(nameof(service.SendMailWebReport), x =>”: We register the job under the name of the “SendMailWebReport()” method. This name is the recurring job ID, so it must be unique.
- “x.SendMailWebReport(url)”: This is the method the job will call. We pass the URL as a parameter.
- “Cron.Weekly(DayOfWeek.Friday, 17, 30)”: This declares the schedule of the job. Cron is a command-line job scheduler on Unix-like operating systems, and Hangfire uses its expression format. We set the schedule to every week on Friday at 17:30. You can practice Cron expressions at https://crontab.guru/
- “TimeZoneInfo.FindSystemTimeZoneById("Turkey Standard Time")”: You can set a specific time zone, as in this example.
RecurringJob.cs:
using GetPageandSendEmail;
using GetPageandSendEmail.Service;
using Hangfire;
namespace GetPageandSendEmail
{
public static class RecurringJobs
{
[Obsolete]
public static void GetWeeklyWebReport(ICoreService service, string url)
{
RecurringJob.RemoveIfExists(nameof(service.SendMailWebReport));
RecurringJob.AddOrUpdate<ICoreService>(nameof(service.SendMailWebReport), x => x.SendMailWebReport(url),
Cron.Weekly(DayOfWeek.Friday,17, 30), TimeZoneInfo.FindSystemTimeZoneById("Turkey Standard Time")); //17:30
}
}
}
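To make the schedule less of a black box, here is a small sketch of the five-field cron string that a weekly schedule corresponds to. CronSketch is a hypothetical stand-in written for this illustration; it does not call Hangfire, although its output should be equivalent to what Cron.Weekly(DayOfWeek.Friday, 17, 30) represents.

```csharp
using System;

// Hypothetical helper for illustration only; it does not use Hangfire.
public static class CronSketch
{
    // Five-field cron layout: minute hour day-of-month month day-of-week.
    public static string Weekly(DayOfWeek day, int hour, int minute)
        => $"{minute} {hour} * * {(int)day}"; // Sunday = 0 ... Saturday = 6
}
```

CronSketch.Weekly(DayOfWeek.Friday, 17, 30) returns "30 17 * * 5", which you can paste into https://crontab.guru/ to check the schedule.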
“I’m not busy… a woman with three children under the age of 10 wouldn’t think my schedule looked so busy.”
— Garrison Keillor
3 Third, we will add some configuration to appsettings.json for Hangfire: a username and a password for the dashboard, and the ParsingUrl parameter, which sets the web page to capture. In real life, keep such critical parameters encrypted in the configuration or keep them in the cloud.
appsettings.json:
"HangfireSettings": {
"UserName": "admin",
"Password": "123456",
"ParsingUrl": "https://www.borakasmer.com/hakkimda/"
},
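One simple way to keep the password out of appsettings.json is to layer environment variables over the JSON file. The following is a sketch under that assumption, not the setup used in this project; the variable name HangfireSettings__Password follows the .NET convention where a double underscore stands for the ":" separator.

```csharp
using System;
using Microsoft.Extensions.Configuration;

// Sketch: same lookup key as in the article, with environment variables layered on top.
IConfiguration configuration = new ConfigurationBuilder()
    .AddJsonFile("appsettings.json", optional: true)
    .AddEnvironmentVariables() // e.g. HangfireSettings__Password overrides the JSON value
    .Build();

string? password = configuration["HangfireSettings:Password"];
Console.WriteLine(password is null ? "Password not configured" : "Password loaded");
```

Sources added later win, so a value set in the environment replaces the one in the file without any change to the code that reads it.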
4 Fourth, we will add the Hangfire registration and configuration to Program.cs.
Hangfire needs a database connection because it keeps its state in a handful of tables. When Hangfire connects to the database server, it creates its tables automatically if they are missing; the database itself must already exist. You can create a new, dedicated Hangfire database or let Hangfire create its tables in an existing database. We will also register ICoreService in the IoC container as Transient.
Program.cs(1):
//Hangfire
builder.Services.AddHangfire(x => x.UseSqlServerStorage("Data Source=192.168.50.173;initial catalog=Hangfire;User Id=****;Password=*****;TrustServerCertificate=True;"));
builder.Services.AddHangfireServer();
builder.Services.AddTransient<ICoreService, CoreService>();
5 Fifth, we will add the Hangfire dashboard credentials. We can open the Hangfire dashboard at the “/job” path, just like Swagger at “/swagger”. For authorization, we have to set a username and password. We can monitor everything on the Hangfire dashboard.
Program.cs(2):
app.UseHangfireDashboard("/job", new DashboardOptions
{
Authorization = new[]
{
new HangfireCustomBasicAuthenticationFilter
{
User = _configuration.GetSection("HangfireSettings:UserName").Value,
Pass = _configuration.GetSection("HangfireSettings:Password").Value
}
}
});
// No app.UseHangfireServer() call is needed: AddHangfireServer() in Program.cs(1) already starts the background job server.
GlobalJobFilters.Filters.Add(new AutomaticRetryAttribute { Attempts = 3 });
If we select Recurring Jobs, we can see each scheduled job’s name, schedule, creation date, last run time, and next run time, and we can trigger the job manually if we want.
“GlobalJobFilters.Filters.Add(new AutomaticRetryAttribute { Attempts = 3 })”: If the job fails with an exception, Hangfire will retry it up to 3 times.
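One thing to keep in mind: the retry is driven by exceptions that escape the job. SendMailWebReport() catches every exception and returns false, so from Hangfire's point of view the job has most likely succeeded and nothing is retried. The sketch below shows a hypothetical variant (ReportJobSketch is not part of the original project) that still removes the file but lets the failure propagate.

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration only.
public static class ReportJobSketch
{
    // Runs the send step, always removes the file, and lets any exception reach the caller.
    public static void Run(string filePath, Action<string> send)
    {
        try
        {
            send(filePath);
        }
        finally
        {
            File.Delete(filePath); // does nothing if the file is already gone
        }
    }
}
```

A job written this way has no catch block, so an SMTP failure surfaces as a failed job and the retry filter gets a chance to run.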
6 Sixth and final step: when the application starts, we resolve ICoreService and call the “GetWeeklyWebReport()” method once, passing this service and ParsingUrl as parameters. From then on, this job will be triggered every Friday at 17:30 for as long as the application is running on IIS.
Program.cs(3):
// Resolve the service from the application's own container instead of building a second provider.
var _coreService = app.Services.GetRequiredService<ICoreService>();
RecurringJobs.GetWeeklyWebReport(_coreService, _configuration.GetSection("HangfireSettings:ParsingUrl").Value);
“I put my heart and my soul into my work, and have lost my mind in the process.” — Vincent Van Gogh
Program.cs(Full):
using GetPageandSendEmail.Service;
using GetPageandSendEmail;
using Hangfire;
using HangfireBasicAuthenticationFilter;
var builder = WebApplication.CreateBuilder(args);
// Add services to the container.
builder.Services.AddControllers();
// Learn more about configuring Swagger/OpenAPI at https://aka.ms/aspnetcore/swashbuckle
builder.Services.AddEndpointsApiExplorer();
builder.Services.AddSwaggerGen();
//Hangfire
builder.Services.AddHangfire(x => x.UseSqlServerStorage("Data Source=192.168.50.173;initial catalog=Hangfire;User Id=***;Password=****;TrustServerCertificate=True;"));
builder.Services.AddHangfireServer();
builder.Services.AddTransient<ICoreService, CoreService>();
var app = builder.Build();
// Configure the HTTP request pipeline.
if (app.Environment.IsDevelopment())
{
app.UseSwagger();
app.UseSwaggerUI();
}
app.UseHttpsRedirection();
app.UseAuthorization();
app.MapControllers();
IConfiguration _configuration = new ConfigurationBuilder()
.AddJsonFile("appsettings.json")
.Build();
//app.UseHangfireDashboard();
//Install-Package Hangfire.Dashboard.Basic.Authentication
//app.UseHangfireDashboard("/job", new DashboardOptions());
app.UseHangfireDashboard("/job", new DashboardOptions
{
Authorization = new[]
{
new HangfireCustomBasicAuthenticationFilter
{
User = _configuration.GetSection("HangfireSettings:UserName").Value,
Pass = _configuration.GetSection("HangfireSettings:Password").Value
}
}
});
// No app.UseHangfireServer() call is needed: AddHangfireServer() above already starts the background job server.
GlobalJobFilters.Filters.Add(new AutomaticRetryAttribute { Attempts = 3 });
// Resolve the service from the application's own container instead of building a second provider.
var _coreService = app.Services.GetRequiredService<ICoreService>();
RecurringJobs.GetWeeklyWebReport(_coreService, _configuration.GetSection("HangfireSettings:ParsingUrl").Value);
app.Run();
Conclusion:
In this article, we captured a web page, which could be a report page, converted it to PDF, and emailed it to specific users.
In a real-life scenario, there was a need to email customers a report page prepared on the client side, in exactly the same format. Since it was impossible to do this from the client side at certain time intervals, the page was captured on the backend to get the same look as the web page. Even if the report data had been fetched in backend code, it would have been impossible to match the client-side CSS and styling. Additionally, every design change on the client side would have to be mirrored in that code, which costs extra effort, and small changes could be overlooked.
We could also run the sending process at certain time intervals with a console app or a worker. But on the cloud, for example, we would need an additional virtual machine for this. With libraries such as Hangfire or Quartz, however, we can configure such repetitive tasks inside the application running on IIS without needing an extra machine.
Although the job schedule is defined when the project first starts, new jobs can be added to or removed from Hangfire dynamically at run time with simple Web API calls.
Dynamic Add Hangfire Job With WebApi Example:
[HttpPost]
[Route("[action]")]
public IActionResult GetHourlyWeatherReport()
{
WeatherReport weather = new(_service);
RecurringJob.RemoveIfExists(nameof(weather.ReportWeather));
RecurringJob.RemoveIfExists(nameof(weather.ReportWeather2));
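RecurringJob.AddOrUpdate(nameof(weather.ReportWeather), () => weather.ReportWeather("bora@borakasmer.com"), Cron.Daily(10, 38)); //Daily at 10:38; the explicit job ID matches RemoveIfExists above
RecurringJob.AddOrUpdate(nameof(weather.ReportWeather2), () => weather.ReportWeather2(), "2 * * * *"); //At minute 2 of every hour; use "*/2 * * * *" for every 2 minutes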
return Ok();
}
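Removing a job dynamically can follow the same pattern. This is a minimal sketch, assuming a hypothetical JobsController and a route that takes the recurring job ID; only RecurringJob.RemoveIfExists() comes from Hangfire.

```csharp
using Hangfire;
using Microsoft.AspNetCore.Mvc;

// Hypothetical controller for illustration only.
[ApiController]
[Route("[controller]")]
public class JobsController : ControllerBase
{
    // DELETE /jobs/{jobId} removes the recurring job with that ID, if there is one.
    [HttpDelete("{jobId}")]
    public IActionResult Remove(string jobId)
    {
        RecurringJob.RemoveIfExists(jobId); // no error when the job does not exist
        return NoContent();
    }
}
```

In a real project, an endpoint like this should sit behind the same kind of authorization as the dashboard, since anyone who can call it can stop the scheduled emails.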
One day if you want to reset or remove Hangfire tables you can use these SQL scripts.
These tools and techniques may not be a magic wand, but they will definitely make our daily work much easier.
See you in the next article.
“If you have read so far, first of all, thank you for your patience and support. I welcome all of you to my blog for more!”