1 year ago

#376027


happykratos

Cannot load HTML document from a Docker container

I wrote a simple scraper that takes a URL and extracts a single XPath match from that document. On my local machine it works perfectly fine, but after building it as a Docker image and running it in a container, it returns a null value. It's because it can no longer load the HTML page.

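// Load the page over HTTP and parse it with Html Agility Pack.
var web = new HtmlWeb();
var doc = web.Load(_url);
// SelectSingleNode returns null when nothing matches the XPath.
var headerNode = doc.DocumentNode.SelectSingleNode("//*[@id='productTitle']");
return headerNode?.InnerText.Trim();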

I'm using Html Agility Pack, if it matters. How can I solve this problem? Has anyone had a similar problem?

Here is my Dockerfile:

FROM mcr.microsoft.com/dotnet/aspnet:6.0-focal AS base
WORKDIR /app
EXPOSE 8080

ENV ASPNETCORE_URLS=http://+:8080
RUN adduser -u 5678 --disabled-password --gecos "" appuser && chown -R appuser /app
USER appuser

FROM mcr.microsoft.com/dotnet/sdk:6.0-focal AS build
WORKDIR /src
COPY ["Scraper.csproj", "./"]
RUN dotnet restore "Scraper.csproj"
COPY . .
WORKDIR "/src/."
RUN dotnet build "Scraper.csproj" -c Release -o /app/build

FROM build AS publish
RUN dotnet publish "Scraper.csproj" -c Release -o /app/publish /p:UseAppHost=false

FROM base AS final
WORKDIR /app
COPY --from=publish /app/publish .
ENTRYPOINT ["dotnet", "Scraper.dll"]
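
To narrow down why the page fails to load inside the container, a diagnostic variant of the scraping code could log what the request actually returned before applying the XPath. The following is only a sketch under assumptions: `ScrapeDiagnostics` and `TryGetTitle` are hypothetical names that do not appear in the original scraper, and the possible failure causes named in the comments are guesses, not confirmed findings.

```csharp
using System;
using HtmlAgilityPack;

// Hypothetical helper for debugging; not part of the original scraper.
public static class ScrapeDiagnostics
{
    public static string? TryGetTitle(string url)
    {
        var web = new HtmlWeb();

        HtmlDocument doc;
        try
        {
            doc = web.Load(url);
        }
        catch (Exception ex)
        {
            // Assumption: network, DNS, or TLS problems inside the
            // container would surface here as an exception.
            Console.Error.WriteLine($"Load failed: {ex.GetType().Name}: {ex.Message}");
            return null;
        }

        // Log the HTTP status and the size of the returned markup so an
        // error page or an empty response can be told apart from a real page.
        Console.Error.WriteLine($"HTTP status: {(int)web.StatusCode} {web.StatusCode}");
        Console.Error.WriteLine($"Document length: {doc.DocumentNode.OuterHtml.Length}");

        // SelectSingleNode returns null when the XPath matches nothing.
        var node = doc.DocumentNode.SelectSingleNode("//*[@id='productTitle']");
        return node?.InnerText.Trim();
    }
}
```

Comparing the logged status and document length between the local run and the container run (for example via `docker logs`) should show whether the request itself fails or whether a different page comes back that simply lacks the `productTitle` element.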

c#

html

docker

html-agility-pack

0 Answers
