See my GitHub page for the full source code described below.
This is a custom pre-rendering solution that uses PuppeteerSharp to render a static website hosted on AWS S3, with source data stored in DynamoDB.
This solution is highly customized for a particular website. However, the concepts covered and the solution structure may be useful to others struggling with the same problem: search engine optimization for websites built with early versions of Angular. These instructions will give you an idea of how to customize the solution for your own purposes.
Prerequisites: a Chrome executable, the URL of the website to be rendered, and either an existing sitemap.xml or content stored in AWS DynamoDB that can be used to create the sitemap. AWS credentials will be needed in the latter case.
Update the appsettings.json file with details of the prerequisites above. You can alter the Program.cs steps as needed for your own solution. For example, you can skip the steps for loading site content from DynamoDB to generate the sitemap if you already have one and only need to render.
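The sitemap generation step mentioned above can be sketched roughly as follows. This is a minimal illustration rather than the code from the repository: the `SitemapSketch` class and its `Build` method are hypothetical names, and it assumes the page URLs have already been loaded (from DynamoDB or anywhere else) into a plain list of strings.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, not part of the original solution.
public static class SitemapSketch
{
    // XML namespace defined by the sitemaps.org protocol.
    private static readonly XNamespace Ns = "http://www.sitemaps.org/schemas/sitemap/0.9";

    // Builds a sitemap document with one <url> entry per absolute page URL.
    public static XDocument Build(IEnumerable<string> urls, DateTime lastModified)
    {
        var entries = urls.Select(url => new XElement(Ns + "url",
            new XElement(Ns + "loc", url),
            new XElement(Ns + "lastmod",
                lastModified.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture))));

        return new XDocument(
            new XDeclaration("1.0", "utf-8", null),
            new XElement(Ns + "urlset", entries));
    }
}
```

Calling `SitemapSketch.Build(urls, DateTime.UtcNow).Save("sitemap.xml")` would write the file to disk, after which the same URLs can be passed on to the rendering step.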
To run this as part of a CI/CD pipeline, you will need to host this console application on a machine that also has Chrome installed. Program.cs could be adapted to run in a continuous loop or to be triggered as needed, for example as a scheduled task.
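As a rough idea of what the continuous-loop variant might look like, here is a minimal sketch. `RenderLoopSketch` and `RunAsync` are hypothetical names, and the `job` delegate stands in for whatever Program.cs currently does in a single run; none of this is taken from the repository.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, not part of the original solution.
public static class RenderLoopSketch
{
    // Runs the job, waits for the interval, and repeats until cancelled.
    // Returns the number of runs that completed without throwing.
    public static async Task<int> RunAsync(Func<Task> job, TimeSpan interval, CancellationToken token)
    {
        var runs = 0;
        while (!token.IsCancellationRequested)
        {
            try
            {
                await job();
                runs++;
            }
            catch (Exception ex)
            {
                // Log and keep looping so one failed run does not stop the schedule.
                Console.WriteLine($"Run failed: {ex.Message}");
            }

            try
            {
                await Task.Delay(interval, token);
            }
            catch (OperationCanceledException)
            {
                // Cancellation was requested while waiting between runs.
                break;
            }
        }
        return runs;
    }
}
```

A scheduled task or a CI/CD trigger is probably the simpler option if the site content changes rarely; a loop like this mainly makes sense when the console application is left running on a dedicated machine.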
The following links proved useful on the journey to creating this solution. They provided a guide as to what I did and did not want to do in my solution.
Eric Lu - SEO for AngularJS on S3
Prerender.io - Middleware you install on a server to check if requests come from a crawler.
Zanon - Provided an introduction to PhantomJS, which ultimately led to Puppeteer.
Stack Overflow - Thread discussing the use of Prerender.io in the context of AWS Lambda.
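// Requires: using System; using System.Collections.Generic; using System.Threading.Tasks;
//           using Amazon.S3; using PuppeteerSharp;
// Note: s3Client is not used in this excerpt.
public static async Task<List<Route>> GetContent(List<Route> routes, bool exportHtml = false, AmazonS3Client s3Client = null)
{
    try
    {
        var result = new List<Route>();
        var options = new LaunchOptions
        {
            Headless = true,
            ExecutablePath = ChromeExePath
        };

        // Launch one headless Chrome instance and reuse it for every route.
        using (var browser = await Puppeteer.LaunchAsync(options))
        {
            foreach (var route in routes)
            {
                Console.WriteLine($"Loading page: {route.Path}");
                using (var page = await browser.NewPageAsync())
                {
                    // Wait until the page has no open network connections,
                    // so that client-side rendering has had a chance to finish.
                    await page.GoToAsync(route.Path, new NavigationOptions
                    {
                        WaitUntil = new WaitUntilNavigation[]
                        {
                            WaitUntilNavigation.Networkidle0
                        }
                    });

                    // Capture the fully rendered HTML.
                    route.Content = await page.GetContentAsync();
                    if (!string.IsNullOrEmpty(route.Content))
                    {
                        route.Content = Format(route.Content);
                    }
                    result.Add(route);

                    // Export only real pages: skip empty content and sitemap routes.
                    // The null guard avoids a NullReferenceException when Query is not set.
                    if (exportHtml && !string.IsNullOrEmpty(route.Content) &&
                        route.Uri != null && route.Query != "//" &&
                        !(route.Query ?? string.Empty).Contains("sitemap"))
                    {
                        Console.WriteLine($"Exporting page: {route.Path}");
                        await FileManager.Export(route);
                    }
                }
            }
        }
        return result;
    }
    catch (Exception ex)
    {
        // Any failure aborts the whole run; callers must handle a null result.
        Console.WriteLine($"Render error: {ex.Message}. {ex.StackTrace}.");
        return null;
    }
}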
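The export step (`FileManager.Export`) is not shown in this post. As a hedged illustration of one piece it would likely need, the sketch below maps a page URL to a relative file path that could also serve as an S3 object key. `ExportPathSketch` and `ToRelativePath` are hypothetical names, and the "one index.html per route folder" layout is an assumption, not necessarily what the repository does.

```csharp
using System;

// Hypothetical helper, not part of the original solution.
public static class ExportPathSketch
{
    // Maps an absolute page URL to a relative file path (or S3 object key).
    // The query string and fragment are ignored.
    //   https://example.com/       -> index.html
    //   https://example.com/about  -> about/index.html
    public static string ToRelativePath(Uri uri)
    {
        var path = uri.AbsolutePath.Trim('/');
        return path.Length == 0 ? "index.html" : path + "/index.html";
    }
}
```

Writing each route to its own folder with an index.html lets a static host that supports index documents serve the pre-rendered page at the original path. Routes that differ only by query string or hash would collide under this scheme and would need a different mapping.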
Tagged: #code
Posted on Nov 05, 2018