What are the limitations/requirements for executing C# code in a notebook? #3534

Open
hjy1210 opened this issue Apr 28, 2024 · 4 comments
Labels
question Further information is requested

Comments

@hjy1210

hjy1210 commented Apr 28, 2024

The package and version I'm asking about:

Polyglot Notebooks v1.0.5208010

Question

What are the limitations/requirements for executing C# code in a notebook?

I can run a simple .NET 8.0 C# console app correctly in VS 2022.
But when I copy the code into a notebook, an error occurs on execution. What is missing when the code is ported to a notebook?

The code in the notebook is as follows:

#r "nuget:itext7"
#r "nuget:itext7.font-asian"

using iText.Kernel.Pdf.Canvas.Parser.Listener;
using iText.Kernel.Pdf.Canvas.Parser;
using iText.Kernel.Pdf;
using System.Text;

string ExtractText(string filePath)
{
    var pdfReader = new PdfReader(filePath);
    var pdfDoc = new PdfDocument(pdfReader);
    StringBuilder sb = new StringBuilder();
    for (int i = 1; i <= pdfDoc.GetNumberOfPages(); i++)
    {
        var page = pdfDoc.GetPage(i);
        LocationTextExtractionStrategy strategy = new LocationTextExtractionStrategy();
        sb.AppendLine(PdfTextExtractor.GetTextFromPage(page, strategy));
    }
    pdfDoc.Close();
    var data = sb.ToString();
    return data;
}
Console.WriteLine(ExtractText(@"c:\lucenedata\documentsroot\2007-1.pdf"));
Console.WriteLine("Press any key to close app");
Console.ReadKey();

The error message is:

Error: iText.IO.Exceptions.IOException: The CMap iText.IO.Font.Cmap.UniCNS-UTF16-H was not found.
at iText.IO.Font.Cmap.CMapLocationResource.GetLocation(String location)
at iText.IO.Font.Cmap.CMapParser.ParseCid(String cmapName, AbstractCMap cmap, ICMapLocation location, Int32 level)
at iText.IO.Font.Cmap.CMapParser.ParseCid(String cmapName, AbstractCMap cmap, ICMapLocation location)
at iText.IO.Font.CjkResourceLoader.ParseCmap[T](String name, T cmap)
at iText.IO.Font.CjkResourceLoader.GetUni2CidCmap(String uniMap)
at iText.Kernel.Font.FontUtil.GetToUnicodeFromUniMap(String uniMap)
at iText.Kernel.Font.PdfType0Font..ctor(PdfDictionary fontDictionary)
at iText.Kernel.Font.PdfFontFactory.CreateFont(PdfDictionary fontDictionary)
at iText.Kernel.Pdf.Canvas.Parser.PdfCanvasProcessor.GetFont(PdfDictionary fontDict)
at iText.Kernel.Pdf.Canvas.Parser.PdfCanvasProcessor.SetTextFontOperator.Invoke(PdfCanvasProcessor processor, PdfLiteral operator, IList`1 operands)
at iText.Kernel.Pdf.Canvas.Parser.PdfCanvasProcessor.InvokeOperator(PdfLiteral operator, IList`1 operands)
at iText.Kernel.Pdf.Canvas.Parser.PdfCanvasProcessor.ProcessContent(Byte[] contentBytes, PdfResources resources)
at iText.Kernel.Pdf.Canvas.Parser.PdfCanvasProcessor.ProcessPageContent(PdfPage page)
at iText.Kernel.Pdf.Canvas.Parser.PdfTextExtractor.GetTextFromPage(PdfPage page, ITextExtractionStrategy strategy, IDictionary`2 additionalContentOperators)
at iText.Kernel.Pdf.Canvas.Parser.PdfTextExtractor.GetTextFromPage(PdfPage page, ITextExtractionStrategy strategy)
at Submission#3.ExtractText(String filePath)
at Submission#4.<<Initialize>>d__0.MoveNext()
--- End of stack trace from previous location ---
at Microsoft.CodeAnalysis.Scripting.ScriptExecutionState.RunSubmissionsAsync[TResult](ImmutableArray`1 precedingExecutors, Func`2 currentExecutor, StrongBox`1 exceptionHolderOpt, Func`2 catchExceptionOpt, CancellationToken cancellationToken)

The following is the PDF file referenced in the code:
2007-1.pdf

hjy1210 added the question (Further information is requested) label Apr 28, 2024
@jonsequitur
Contributor

This might be an issue with this specific package. Do you happen to know what location it's looking for? For example, if it's looking in a build output location, it won't find it, since there's no build output for the C# Script.

Unrelated to the exception, Console.ReadLine won't work in the notebook. Input gestures are documented here: https://github.com/dotnet/interactive/blob/main/docs/input-prompts.md
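
As a rough sketch of what the end of the notebook cell could look like without console input: the wait for a key press can simply be dropped, because a cell finishes on its own, and a value can be requested through the notebook UI with `Kernel.GetInputAsync`, which the linked input-prompts document describes. The prompt text and the variable name `pdfPath` are illustrative assumptions; `ExtractText` is the function from the question and must already be defined in an earlier cell.

```csharp
using System;
using Microsoft.DotNet.Interactive;

// Ask the notebook UI for the file path instead of reading from the console.
// The prompt text and the variable name are illustrative.
var pdfPath = await Kernel.GetInputAsync("Path to the PDF file:");

// ExtractText is the function defined in the question's cell.
Console.WriteLine(ExtractText(pdfPath));

// No Console.ReadKey() here: the cell completes without waiting for a key press.
```

This only addresses the console-input point; it does not change the CMap exception.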

@hjy1210
Author

hjy1210 commented Apr 30, 2024

@jonsequitur
Regarding "Do you happen to know what location it's looking for?": what does that mean?

@jonsequitur
Contributor

jonsequitur commented Apr 30, 2024

I was referring to this from your exception details:

Error: iText.IO.Exceptions.IOException: The CMap iText.IO.Font.Cmap.UniCNS-UTF16-H was not found.
at iText.IO.Font.Cmap.CMapLocationResource.GetLocation(String location)

My guess is that this is a file in the package that the build would normally copy to the build output (in a normal C# project build). The code is probably looking for this file in that location. But C# Script doesn't do a build and so the file isn't in the expected location (but it is in the NuGet package cache).

This would be something that this package would need to account for in order to work correctly in C# Script / .NET Interactive.
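
One way to test the guess above from inside the notebook is to look at what the kernel process has actually loaded. The sketch below uses only standard .NET reflection calls. It assumes that the dotted name in the exception message (`iText.IO.Font.Cmap.UniCNS-UTF16-H`) is an embedded resource name rather than a file path, which is an inference from the message and is not confirmed anywhere in this thread. Run it in a cell after the `#r "nuget:..."` lines.

```csharp
using System;
using System.Linq;

// CMap name taken from the exception message.
var cmapName = "UniCNS-UTF16-H";

// Embedded resources can only be listed for non-dynamic assemblies.
var loaded = AppDomain.CurrentDomain.GetAssemblies()
    .Where(a => !a.IsDynamic)
    .ToList();

// 1. Which iText assemblies are loaded, and from which path?
foreach (var asm in loaded)
{
    var name = asm.GetName().Name ?? "";
    if (name.StartsWith("itext", StringComparison.OrdinalIgnoreCase))
    {
        Console.WriteLine($"{name} loaded from {asm.Location}");
    }
}

// 2. Does any loaded assembly carry the CMap as an embedded resource?
var hits = loaded
    .SelectMany(a => a.GetManifestResourceNames()
        .Select(r => (Owner: a.GetName().Name, Resource: r)))
    .Where(x => x.Resource.EndsWith(cmapName, StringComparison.OrdinalIgnoreCase))
    .ToList();

if (hits.Count == 0)
{
    Console.WriteLine($"No loaded assembly exposes an embedded resource ending in {cmapName}.");
}

foreach (var hit in hits)
{
    Console.WriteLine($"{hit.Owner}: {hit.Resource}");
}
```

If `itext.font_asian` is missing from the first list, or the second check finds nothing, the lookup is probably not seeing that assembly in the notebook process. If both checks succeed, the problem more likely lies in how the package locates the resource.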

@hjy1210
Author

hjy1210 commented Apr 30, 2024

@jonsequitur
I still do not know how to fix the problem. Thanks for your time.

The Visual Studio C# project build output directory contains the following files. When I run the executable RxNetPuzzle.exe, the program works as expected.

itext.barcodes.dll
itext.bouncy-castle-connector.dll
itext.commons.dll
itext.font_asian.dll
itext.forms.dll
itext.io.dll
itext.kernel.dll
itext.layout.dll
itext.pdfa.dll
itext.pdfua.dll
itext.sign.dll
itext.styledxmlparser.dll
itext.svg.dll
Microsoft.DotNet.PlatformAbstractions.dll
Microsoft.Extensions.DependencyInjection.Abstractions.dll
Microsoft.Extensions.DependencyInjection.dll
Microsoft.Extensions.DependencyModel.dll
Microsoft.Extensions.Logging.Abstractions.dll
Microsoft.Extensions.Logging.dll
Microsoft.Extensions.Options.dll
Microsoft.Extensions.Primitives.dll
Newtonsoft.Json.dll
RxNetPuzzle.deps.json
RxNetPuzzle.dll
RxNetPuzzle.exe
RxNetPuzzle.pdb
RxNetPuzzle.runtimeconfig.json
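
For comparison with the build output above, the sketch below lists what the `itext7.font-asian` package contains in the NuGet package cache, which is where the notebook resolves `#r "nuget:..."` references from according to the earlier comment. It assumes the default global packages folder (`.nuget/packages` under the user profile, or the `NUGET_PACKAGES` environment variable when set). A custom location configured in a NuGet.config file is not handled here.

```csharp
using System;
using System.IO;

// Default global packages folder, unless NUGET_PACKAGES overrides it (assumption: no NuGet.config override).
var packagesRoot = Environment.GetEnvironmentVariable("NUGET_PACKAGES")
    ?? Path.Combine(
        Environment.GetFolderPath(Environment.SpecialFolder.UserProfile),
        ".nuget", "packages");

// NuGet stores each package under its lowercased id.
var packageDir = Path.Combine(packagesRoot, "itext7.font-asian");

if (!Directory.Exists(packageDir))
{
    Console.WriteLine($"Not found: {packageDir}");
}
else
{
    // Print every file in the package, relative to the package folder.
    foreach (var file in Directory.EnumerateFiles(packageDir, "*", SearchOption.AllDirectories))
    {
        Console.WriteLine(Path.GetRelativePath(packageDir, file));
    }
}
```

The listing shows whether the package ships anything besides `itext.font_asian.dll` that a project build would copy to the output directory.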
