Channel: PDFTron Blog

Mobile Cross-Platform PDF Viewers: Options for Android, iOS, Windows Store Apps and Windows Phone 8

The rise of mobile platforms, each with its own native programming language and API, has created new demand for cross-platform development tools and SDKs. To display a PDF, most cross-platform toolkits offer either a C++ interface (which does not provide a native UI component) or a simple PDF-to-image style solution. In this post, we will outline some better options for handling PDFs in a cross-platform manner on mobile devices.

Native SDKs: 4 Platforms, 4 Languages, 1 API

The first option is to use our native SDK, which provides a Java interface on Android, an Objective-C interface on iOS, and C#, C++/CX, and JavaScript interfaces for Windows Store Apps and Windows Phone apps. This might not immediately sound very “cross-platform”, but in a sense it is, because our library has the same API on all our platforms. For example, here is code for creating a new PDF document, adding a page, and then displaying the page count, on three different platforms. Notice that aside from platform-specific differences, the code is identical.

Android (Java):

PDFNet.initialize();

PDFDoc myDoc = new PDFDoc();
PDFRect letterPageRect = new PDFRect(0, 0, 612, 792);
Page page = myDoc.pageCreate(letterPageRect);
myDoc.pagePushFront(page);
int numPages = myDoc.getPageCount();

System.out.println("My document has " + numPages + " pages.");

Windows Store Apps and Windows Phone (C#):

PDFNet.Initialize();

PDFDoc myDoc = new PDFDoc();
PDFRect letterPageRect = new PDFRect(0, 0, 612, 792);
Page page = myDoc.PageCreate(letterPageRect);
myDoc.PagePushFront(page);
int numPages = myDoc.GetPageCount();

System.Diagnostics.Debug.WriteLine("My document has {0} pages.", numPages);

Note that PDFNet is also compatible with C++/CX and JavaScript for Windows Store Apps.

iOS (Objective-C):

[PDFNet Initialize:@""];

PDFDoc* myDoc = [[PDFDoc alloc] init];
PDFRect* letterPageRect = [[PDFRect alloc] initWithX1:0 y1:0 x2:612 y2:792];
Page* page = [myDoc PageCreate:letterPageRect];
[myDoc PagePushFront:page];
int numPages = [myDoc GetPageCount];

NSLog(@"My document has %d pages.", numPages);

Native Cross-platform: Xamarin.iOS and Xamarin.Android

Having the same PDF API across platforms helps with the PDF processing portion of an app, but some developers want to extend the same principle to the entire app. A prominent company that helps with this is Xamarin.

Xamarin’s products Xamarin.iOS and Xamarin.Android allow developers to create apps for Android and iOS using C#, and to use the same non-UI code on both platforms. Xamarin has a system for “wrapping” native libraries such as PDFNet so that they can be directly accessed via C# in a Xamarin app. Wrapping a library can be a time-consuming and at times tricky process, so we have prepared a pre-wrapped version of PDFNet for Xamarin.Android and Xamarin.iOS which you can download here.

Hybrid Apps: PhoneGap, Cordova and Appcelerator Titanium

PhoneGap (and its underlying technology, Cordova) and Appcelerator Titanium take a different approach to cross-platform development by wrapping an HTML5 web app in a native container. This way the app developer writes a single HTML5 app using HTML/CSS/JavaScript, which is hosted in a native web view on each target platform. These are called hybrid apps. The native portion of the app is mostly responsible for simply showing a web page which runs the actual HTML5 app. However, because the web page is hosted by a native app, it can be sold in an app store just like any other app.

So how can you add PDF viewing to a hybrid app?

1. Direct Native Integration

PhoneGap and Appcelerator Titanium provide a bridging interface so that native libraries can be integrated into these apps. With PhoneGap, you need to write a “plug-in”, with the process detailed on PhoneGap’s website for Android and iOS.

On Appcelerator Titanium, integrating with native code involves writing a “module”, for which the process is outlined on their webpage for both Android and iOS.

2. HTML5 WebViewer

Another approach to cross-platform document viewing is to dispense with the native control and in its place use PDFTron’s WebViewer, a pure-HTML5 document viewer. The WebViewer works with PDF documents that have been converted to a web-optimized XPS format that preserves the font and vector content found in a PDF, and presents them in a mobile, multi-touch friendly viewer.

Here are some pros and cons to consider when evaluating using a native control vs. the HTML5 WebViewer within a hybrid app:

PDFNet Native Pros: Fast; works directly on any PDF.
PDFNet Native Cons: Requires the inclusion of a multi-megabyte library; requires more work to integrate.

WebViewer Pros: Simple integration for all platforms; extremely lightweight; lower licensing cost.
WebViewer Cons: Does not have all of the PDF processing capabilities that are available in PDFNet; documents need to be optimized to guarantee fast viewing across devices.

We have a sample integration with Cordova on our WebViewer sample page.

Custom Hybrid Apps

Developers also have the option of writing custom hybrid apps without relying on PhoneGap, Appcelerator, etc. This may offer some additional flexibility or performance over using one of the toolkits. As when using the hybrid app toolkits, either the native library (PDFNet) or the WebViewer could be used, with the same pros and cons listed above applying. The WebViewer web page has sample apps for Android, iOS and WinRT. We will also be writing a tutorial blog post about custom hybrid apps, so stay tuned!

Pure Web Apps

Lastly, we come to writing a pure web app: a web site that looks and acts like an app but is only accessible via the web using a standard browser. These apps can provide document viewing using the WebViewer, a pure HTML5 component that will work on all modern mobile platforms. As stated above, the WebViewer is optimized for mobile viewing, including using swipe gestures to turn pages and pinch-to-zoom.



PDF SDK bindings for PHP, Python, and Ruby

PDFTron open sources PDFNet language bindings for PHP, Python, and Ruby.

We are happy to announce that the source code for the PHP, Python, and Ruby bindings for PDFNet is now open source.

This is in response to many PDFNet customers needing support for specific language versions. Previously, PDFTron’s pre-compiled bindings supported only specific versions of these languages (specifically PHP 5.3, Python 2.7/3.2, and Ruby 1.9). It was clear that PDFNet users required more flexibility to support different languages, versions, and system configurations.

With the PDFNet SWIG bindings open sourced, PDFTron clients now have full control over the programming languages, platforms, and environments used in their solutions.

Besides using PDFNet with PHP, Python, and Ruby, the provided interface files can be used as a template to interface PDFNet with other programming languages (such as Lua, Go, Mono, D, Perl, R, OCaml, and other languages supported by SWIG).

To get started, please follow the instructions provided in the README.txt. We welcome community feedback, pull requests, and additional language extensions.

PDFNet SWIG Project on GitHub: https://www.github.com/PDFTron/PDFNetWrappers


Table extraction and PDF to XML with PDFGenie

Intro

PDF is a hugely popular format, and for good reason: with a PDF, you can be virtually assured that a document will display and print exactly the same way on different computers. However, PDF documents suffer from a drawback in that they are usually missing information specifying which content constitutes paragraphs, tables, figures, header/footer info etc. This lack of ‘logical structure’ information makes it difficult to edit files or to view documents on small screens, or to extract meaningful data from a PDF. In a sense, the content becomes ‘trapped’. In this article we discuss the logical structure problem and introduce PDFGenie, a tool for extracting text and tables, as well as establishing a ground truth for evaluating progress in this area by PDFGenie as well as other tools.

Why is PDF so popular and what is its Achilles’ heel?

After HTML, PDF is one of the most popular document formats on the Web. Google stats show that PDF is used to represent over 70% of the non-HTML web. These are just the files that Google has indexed. There are likely to be many more in private silos such as company databases, academic archives, bank statements, credit card bills, material safety data sheets, product catalogues, product specifications, etc.

One of the main reasons why PDF is so popular is that it can be used for accurate and reliable visual reproduction across software, hardware, and operating systems.

To achieve this, PDF essentially became the ‘assembly language’ of document formats. It is fairly easy to ‘compile’ (i.e. convert) other document formats to PDF, but the reverse (i.e. decompiling PDF to a high-level representation) is much more difficult.

As a result, most PDF documents are missing logical structures such as paragraphs, tables, figures, header/footers, the reading order, sections, chapters, TOC, etc.

Although PDF could technically be used to store this type of structured information via marked content, it is usually not present. When it is available, techniques similar to the one shown in the LogicalStructure sample can be used to extract structured content.

Unfortunately, even when a file contains some tags, they are frequently not very useful because there is no universally accepted grammar for logical structure in documents (just like there is no universally accepted high-level programming language). Tags are also frequently incorrect or damaged due to file manipulation or errors in PDF generation software.

The lack of structural information makes it difficult to reuse and repurpose the digital content represented by PDF.

So, although massive amounts of unstructured data are held in the form of PDF documents, automated extraction of tables, figures, and other structured information from PDF can be very difficult and costly.

How difficult is it to extract a table from PDF?

Based on 15+ years of experience developing PDF toolkits for developers, we can attest that there is a profound lack of appreciation for the complexity of the problem. Interestingly, this view often comes from developers and even technology experts. As with the 1966 MIT Summer Vision Project, which set out to solve computer vision over a single summer, or Knuth giving his student a small side project (which became TeX), it is hard and counter-intuitive to appreciate the full scope of what is required to get computers to understand documents.

Just as with OCR, there is no ‘perfect’ solution. For example, it is not difficult to run into cases where humans segment and label parts of a document in widely different ways. Is the text column an article, or a column, or is it part of a table? Sometimes you need to understand the semantics of the content to decide; at other times multiple interpretations are possible. In the figure below, do you see a face or a vase?

[Figure: the face/vase optical illusion]

There are multiple annual conferences and hundreds of papers and doctorates published every year on the topic. The pace of research is only increasing and the problem is still far from being cracked.

Some of our users say “but I am already using TextExtractor, or pdf2text, or solution X. These seem to work fine for me”.

These tools return precise positioning information for each character and can build simple segmentations: e.g. word > line > block > column. Most commercial solutions are tweaks of segmentation algorithms developed in the 1980s. They use a set of efficient bottom-up heuristics with hard-coded thresholds. Small but inevitable errors tend to propagate and cause serious issues down the line. For this reason most existing solutions usually produce very shallow structure (e.g. ‘paragraphs’).
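
To make the idea of a bottom-up heuristic with a hard-coded threshold concrete, here is a minimal Python sketch. It is not taken from any of the tools mentioned above: the `Glyph` type, the `group_words` and `make_line` helpers, and the 0.3 gap ratio are all illustrative assumptions. It groups the characters of one line into words by comparing the gap between neighbouring glyphs to a fixed fraction of the font size.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Glyph:
    """A character with its horizontal extent on one text line (hypothetical type)."""
    char: str
    x0: float  # left edge, in points
    x1: float  # right edge, in points


def group_words(glyphs: List[Glyph], font_size: float, gap_ratio: float = 0.3) -> List[str]:
    """Group glyphs on one line into words using a fixed gap threshold."""
    words: List[str] = []
    current = ""
    prev_right = None
    for g in sorted(glyphs, key=lambda g: g.x0):
        # Start a new word when the gap exceeds the hard-coded threshold.
        if prev_right is not None and g.x0 - prev_right > gap_ratio * font_size:
            words.append(current)
            current = ""
        current += g.char
        prev_right = g.x1
    if current:
        words.append(current)
    return words


def make_line(text: str, advance: float, letter_spacing: float, space_width: float) -> List[Glyph]:
    """Lay out text left to right; a space moves the pen instead of emitting a glyph."""
    glyphs, x = [], 0.0
    for ch in text:
        if ch == " ":
            x += space_width
            continue
        glyphs.append(Glyph(ch, x, x + advance))
        x += advance + letter_spacing
    return glyphs


# Same text, two typesetting styles: normal spacing and wide letter spacing.
normal = make_line("tax rate", advance=5.0, letter_spacing=0.5, space_width=3.0)
tracked = make_line("tax rate", advance=5.0, letter_spacing=4.0, space_width=3.0)

print(group_words(normal, font_size=10.0))   # the threshold separates the two words
print(group_words(tracked, font_size=10.0))  # the same threshold splits every letter
```

With normal spacing the threshold finds the two words, but with wide letter spacing the same threshold over-segments the line into single letters. Every later stage (lines, blocks, columns, tables) inherits that mistake, which is the error propagation described above.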

What is the best tool to extract structure from PDF?

Unfortunately, if you are looking for anything a bit more complicated, even something as simple as table extraction, it is likely that you will eventually be disappointed regardless of the tool. For example, in some cases text may be jumbled, cells may be glued together or over-segmented, table boundaries may be incorrect, etc.

It doesn’t matter what tool you use; they will all under-perform because they are all based on the same paradigm. There are certainly some promising approaches in machine learning that may improve the situation in years to come (yes, we are working on this); however, it remains to be seen and will not happen overnight.

Before we see the next generation of tools, we will first need to figure out how to rate them in terms of recognition quality/accuracy.

Unfortunately, because there is no common test suite and no common ground truth, currently it is impossible to objectively compare and evaluate algorithms/solutions. Different recognition engines will inevitably represent higher-level document structures differently. Different engines will also optimize or bias towards certain ‘types/grammars’ or ‘classes’ of documents which also complicates comparisons. Like document recognition itself, the evaluation of performance of recognition algorithms is also an active area of research without clear or effective solutions.

PDF Liberation Hackathon

Without a scientific approach to quantify the performance of different algorithms/solutions, how can we compare solutions, make recommendations, or advance the state of the art in document recognition? How can we dream about liberating PDF content?

With this in mind, we were looking with some anticipation at the ‘PDF Liberation Hackathon’ that ran between Jan. 17-19, 2014.

Although it is possible that the event helped with document processing in some niche workflows it would be hard to say that it liberated PDF content. Perhaps the main issue may be with the event name, which may be further entrenching the false idea that PDF content can be ‘liberated through overnight hacking’. It is also unclear how helpful any software recommendations would be without objective evaluation or a proper benchmark.

At the same time the event was quite effective at getting people to start thinking about document recognition. Hopefully this will lead to development of common test suites and a ground truth that can be used as a starting point for evaluation of different solutions.

PDFGenie

On our end, the event was a catalyst to showcase the second-generation table recognition technology we have been working on over the last few years. Although the technology has been available for some time as a PDFNet Add-on, it is now available as an easy-to-use command-line tool called PDFGenie.

PDFGenie can extract tables, text, and reading order from existing PDF documents in the form of HTML or XML output.

There is no special installation required. After you unzip the archive you are ready to go. For example: > pdfgenie my.pdf  will convert

[Figure: the sample PDF page, Chicago5]

to the following HTML.

For a list of available options type pdfgenie -h

Evaluating accuracy of PDF table recognition

A particularly useful option offered by PDFGenie is -x (or --xfdf). This option produces XFDF (XML Forms Data Format). When the XFDF file resides in the same folder as the original PDF, opening the XFDF will load an annotation layer on top of the original PDF, as shown in the image below:

[Figure: the same page with the annotation layer loaded, Chicago5_annotated]

In this case annotations are used to visually highlight regions with tables and other structures. The annotated PDF can be used for quick visual validation of document recognition.

The same option could be useful when generating ground truth (i.e. when you want to define the output you would expect to see). Instead of having to manually tag hundreds of documents, PDFGenie could generate initial labels and any PDF annotator could be used to manually correct misclassified regions.

The primary advantage of annotations is that they are completely decoupled from the content stream and are therefore easier to manipulate compared to ‘tagged PDF’.

Using annotations to label PDF regions has some limitations (e.g. disjoint regions may overlap or nest) compared to marked content (or ‘Tagged PDF’), however we found that these cases are relatively rare and could be worked around by adding extra properties to annotation dictionaries.

Along with the input PDF, PDFGenie can accept a ground truth XFDF file and compute the error rate and other statistics that can be used to evaluate deviation from ideal output. This is a sample report generated with the -r (or --report) option. In this case all stats are perfect because the input XFDF is exactly the same as the one generated with the -x option. If we modified the annotations to correct for errors, the statistics would be less than perfect.
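
This post does not describe how PDFGenie computes its statistics, so the following is only a sketch of one common way to score detected regions against a ground truth, written in Python. Regions are matched by intersection over union (IoU), and the result is reported as precision and recall. The `Rect` layout, the `score` function, and the 0.8 matching threshold are assumptions chosen for illustration.

```python
from typing import List, Tuple

Rect = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in PDF points


def area(r: Rect) -> float:
    """Area of a rectangle; degenerate or inverted rectangles count as empty."""
    return max(0.0, r[2] - r[0]) * max(0.0, r[3] - r[1])


def iou(a: Rect, b: Rect) -> float:
    """Intersection over union of two axis-aligned rectangles."""
    inter = area((max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3])))
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0


def score(truth: List[Rect], detected: List[Rect], threshold: float = 0.8) -> Tuple[float, float]:
    """Greedily match each ground-truth region to one unused detection.

    Returns (precision, recall).
    """
    unused = list(detected)
    matched = 0
    for t in truth:
        best = max(unused, key=lambda d: iou(t, d), default=None)
        if best is not None and iou(t, best) >= threshold:
            matched += 1
            unused.remove(best)
    precision = matched / len(detected) if detected else 1.0
    recall = matched / len(truth) if truth else 1.0
    return precision, recall


# Two hand-labelled table regions on a page.
truth = [(72, 400, 540, 700), (72, 100, 540, 300)]

# Ground truth identical to the detections: every statistic is perfect.
print(score(truth, list(truth)))

# The second table was detected at only about half its true width.
print(score(truth, [(72, 400, 540, 700), (72, 100, 300, 300)]))
```

Because the regions are plain rectangles, the same comparison works for any engine that can export its output as annotations, which is what makes a shared ground truth practical.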

Towards common evaluation framework for document recognition

The above discussion regarding XFDF, ground truth, and automated verification is not only related to PDFGenie.

For example, XFDF (which is an upcoming ISO standard) could be used as a format to label a common ground truth repository independent of any particular document recognition engine. This repository could be used as a measuring stick for further advances in document recognition.

Rather than trying to liberate PDF content by focusing on specific solutions (either commercial or open source), perhaps a more effective approach would be development of a truly open ground truth repository along with corresponding benchmark and evaluation framework?

In the long run, it is unlikely that development of a specific document recognition engine will revolutionize the field, however a community-driven framework for scientific evaluation of different solutions does have a potential to make a big impact.

Even though this project is manageable, it is unlikely to happen as a result of a hackathon or a similar event. To be successful, the project would require the focus and financial backing that usually come from some sort of organization.

If the above resonates with you and you would like to join the project please let us know.


iOS Tools V 6.2

The way tool library callbacks work has changed for better transparency and customization. Updating existing code to work with the new PDFViewCtrl should only take a couple of minutes and is described here.

Previously, PDFViewCtrl had a tool property that would be sent events and would be destroyed and replaced by new tools by the control when required. The new system moves the tools destruction/creation logic outside of the control so that it can be accessed and modified by users of the SDK. In 6.2.0, instead of directly setting a tool on the PDFViewCtrl, a “ToolManager” is registered to receive events, and it is the ToolManager’s job to then pass these events to the actual tools as well as destroy and create new tools when necessary.

Previously, the first tool would have been registered like this:

// assign the initial tool as a PanTool
[myPDFViewCtrl setTool:[[PanTool alloc] initWithPDFViewCtrl:myPDFViewCtrl]];

Now it is registered like this:

// create a tool manager
_toolManager = [[ToolManager alloc] initWithPDFViewCtrl:myPDFViewCtrl];

// set the initial tool to be a pan tool
[_toolManager changeTool:[PanTool class]];

// set the tool manager to receive the events from the PDFViewCtrl
myPDFViewCtrl.toolDelegate = _toolManager;

Subsequent changes to the active tool can be made within the app like this:

[_toolManager changeTool:[ADifferentTool class]];
or
[_toolManager setTool:[[ADifferentTool alloc] initWithPDFViewCtrl:myPDFViewCtrl]];

The source code for the new (default) ToolManager can be found in the Tools project, located at Lib/src/PDFViewCtrlTools/Tools/ToolManager.m.
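
To illustrate the division of responsibilities in this design, here is a minimal Python sketch of the pattern. It is not the Tools library itself: every class and method name in it (`ViewCtrl`, `tool_delegate`, `on_tap`, `AnnotEditTool`, and so on) is a hypothetical stand-in. The control only forwards events to its delegate, and the manager decides which tool handles them and when to replace it.

```python
class PanTool:
    """Default tool: asks for the edit tool when an annotation is tapped."""

    def __init__(self, view):
        self.view = view

    def on_tap(self, point):
        # Returning a class asks the manager to switch tools; None keeps the current one.
        return AnnotEditTool if self.view.annotation_at(point) else None


class AnnotEditTool:
    """Edit tool: hands control back to PanTool when empty space is tapped."""

    def __init__(self, view):
        self.view = view

    def on_tap(self, point):
        return None if self.view.annotation_at(point) else PanTool


class ToolManager:
    """Owns the active tool and the logic for creating and replacing tools."""

    def __init__(self, view):
        self.view = view
        self.tool = None

    def change_tool(self, tool_class):
        self.tool = tool_class(self.view)

    def on_tap(self, point):
        next_tool = self.tool.on_tap(point)
        if next_tool is not None:
            self.change_tool(next_tool)


class ViewCtrl:
    """The control knows nothing about tools; it just forwards events to its delegate."""

    def __init__(self, annotations):
        self.annotations = set(annotations)
        self.tool_delegate = None

    def annotation_at(self, point):
        return point in self.annotations

    def tap(self, point):
        if self.tool_delegate is not None:
            self.tool_delegate.on_tap(point)


view = ViewCtrl(annotations={(10, 10)})
manager = ToolManager(view)
manager.change_tool(PanTool)
view.tool_delegate = manager

view.tap((10, 10))  # tapping an annotation: the manager swaps in the edit tool
print(type(manager.tool).__name__)
```

Since the switching logic lives in the manager rather than in the control, an app can subclass or replace the manager to change how tools are chosen without touching the control.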


A simple example of converting PDF to HTML

We have received lots of interest in our new PDF to HTML/EPUB conversion since it was released in PDFNet 6.0. With this interest we have also gotten questions on customizing the output. So today I’ll provide a quick demo of converting a PDF to HTML using PDFNet.

In another post I will go into some of the details particular to PDF to EPUB conversion, but everything in today’s post applies to both HTML and EPUB output.

Furthermore, while PDFNet is available in C/C++, Java, Objective-C, Python, Ruby, PHP, VB and C#, I decided to do this demo in C# because of that language’s popularity. The PDFNet API is consistent enough that you should be able to easily translate the code to another language.

Setup

First, download PDFNet from our download page.
http://www.pdftron.com/pdfnet/downloads.html
For this demo I downloaded PDFNet for Windows Desktop .Net 4+. But you can just as easily download any of our desktop versions (including Linux and Mac).

After unzipping the download, navigate to the Samples folder, and select one of the Visual Studio solutions. For me, I chose Samples_2013.sln.

Once in Visual Studio, right click the ConvertTestCS2013 project and select Set as Startup Project.

For this demo, we will simulate the following requirements:

  • Convert only odd-numbered pages.
  • Target iOS devices.
  • High image quality (DPI).
  • Use PNG instead of JPG.
  • No HTML hyperlinks to URLs outside of the document.

Since we are targeting iOS, a quick look at Apple’s official Safari iOS resource limits shows we want to have a 3 megapixel (MP) limit. We will also crank up the DPI so the output looks as good as possible on a retina display.
Safari Web Content Guide

Code

Here then is the code to accomplish the above.

using (PDFDoc doc = new PDFDoc(inputPath + "newsletter.pdf"))
{
    doc.InitSecurityHandler();
    // remove all even pages
    if(doc.GetPageCount() > 1)
    {
        PageIterator itr = doc.GetPageIterator();
        itr.Next(); // skip first page
        while (itr.HasNext())
        {
            doc.PageRemove(itr); // remove even pages
            itr.Next();
        }
    }
    pdftron.PDF.Convert.HTMLOutputOptions options = new pdftron.PDF.Convert.HTMLOutputOptions();
    options.SetInternalLinks(true);
    options.SetExternalLinks(false);
    options.SetPreferJPG(false);
    options.SetDPI(300);
    options.SetMaximumImagePixels(3000000);
    options.SetSimplifyText(true);
    options.SetScale(2.0);
    pdftron.PDF.Convert.ToHtml(doc, outputPath + "newsletter_odd_pages", options);
}

What does all the code above mean?

After initializing the library and opening the document, we first modify the document in memory by removing the even-numbered pages. As long as we do not call PDFDoc.Save(), these changes do not affect the original source file.

Tip: There are many more code examples showing how to use PDFNet, available in the downloaded samples and on our forum.
www.pdftron.com/pdfnet/samplecode.html
https://groups.google.com/forum/?fromgroups#!forum/pdfnet-sdk

PDF to HTML Options

Now onto the PDF to HTML code.

options.SetInternalLinks(true);
options.SetExternalLinks(false);

Above we make sure internal links are enabled, which ensures that any internal links in a PDF are included in the HTML, for example a table of contents. The next line though disables any links that would take the reader outside of the document, such as another website.

options.SetPreferJPG(false);
options.SetDPI(300);
options.SetMaximumImagePixels(3000000);

Next, we turn on PNG image output and increase the image DPI to 300, but set a 3 MP limit so as not to overload an iOS device. The result is that PNGs will be generated at 300 DPI, except where that would put the image over 3 MP. In the latter case, the image will be down-sampled to the highest DPI that will keep it under 3 MP.
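
As a worked example of how the DPI and pixel limits interact, here is a small Python sketch of the arithmetic. It only illustrates the rule described above: the function name is made up, and how PDFNet actually measures image sizes or rounds the result is not covered here.

```python
import math


def effective_dpi(width_in: float, height_in: float, dpi: float, max_pixels: int) -> float:
    """Return the requested DPI, or the largest DPI that keeps the image within max_pixels."""
    pixels = (width_in * dpi) * (height_in * dpi)
    if pixels <= max_pixels:
        return dpi
    # The pixel count grows with the square of the DPI, so solve
    # width * height * dpi^2 = max_pixels for dpi.
    return math.sqrt(max_pixels / (width_in * height_in))


# A 2 x 3 inch image at 300 DPI is 600 x 900 = 540,000 pixels: under the limit.
print(effective_dpi(2, 3, 300, 3_000_000))

# A full letter-size page (8.5 x 11 inches) at 300 DPI would be
# 2550 x 3300 = 8,415,000 pixels, so the DPI has to drop to roughly 179.
print(round(effective_dpi(8.5, 11, 300, 3_000_000), 1))
```

In other words, small figures keep the full 300 DPI, while an image that fills the page ends up at a little under 180 DPI.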

options.SetSimplifyText(true);

Here, we enable text optimization. This attempts to merge text runs in the PDF file, to reduce HTML DOM complexity, and reduce HTML file size. This can result in text placement not matching exactly what was in the PDF, but to the human eye it is typically not noticeable, even when viewing the output side by side with the original. On the other hand, it will reduce download, layout, and rendering times.
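
The trade-off can be illustrated with a short Python sketch of run merging. This is not PDFNet's algorithm, which is not described in this post: the `Run` type, the `merge_runs` function, and the half-point tolerance are assumptions. Adjacent runs on the same baseline with the same font are joined whenever the next run starts within the tolerance of where the previous one ended.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Run:
    """One positioned piece of text (hypothetical type)."""
    text: str
    x: float      # left edge, in points
    width: float
    y: float      # baseline, in points
    font: str


def merge_runs(runs: List[Run], tolerance: float = 0.5) -> List[Run]:
    """Merge consecutive runs that share a font and baseline and nearly touch."""
    merged: List[Run] = []
    for run in runs:
        last = merged[-1] if merged else None
        if (last is not None and last.font == run.font and last.y == run.y
                and abs(run.x - (last.x + last.width)) <= tolerance):
            # Offsets smaller than the tolerance are absorbed into the merged run,
            # which is where slight placement differences come from.
            merged[-1] = Run(last.text + run.text, last.x,
                             run.x + run.width - last.x, last.y, last.font)
        else:
            merged.append(run)
    return merged


runs = [
    Run("Hel", 72.0, 15.0, 700.0, "F1"),
    Run("lo ", 87.2, 12.0, 700.0, "F1"),    # starts 0.2 pt after the previous run ends
    Run("world", 99.2, 25.0, 700.0, "F1"),
    Run("Next", 72.0, 20.0, 686.0, "F1"),   # different baseline: stays separate
]

for r in merge_runs(runs):
    print(repr(r.text), r.x, r.y)
```

Four positioned elements become two, at the cost of the browser now deciding the exact position of each glyph inside the merged run.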

options.SetScale(2.0);

Finally, we will scale the HTML output so that it is easier to read in the browser, without having to rely on the browser to zoom.

DocPub CLI

For those that prefer command line tools, here is how you would get the same output using our DocPub command line tool.

docpub.exe -f html --internal_links --prefer_jpg false --dpi 300 --max_image_pixels 3000000 --simplify_text --scale 2.0 input.pdf

You can download DocPub for Windows, Mac and Linux from here.
http://www.pdftron.com/docpub/downloads.html

Conclusion

I hope that you find this information useful and that you give PDF to HTML conversion a test drive soon. Stay tuned for more information on creating EPUBs!

References:

High Quality EPUB / HTML From PDF : http://blog.pdftron.com/2013/11/15/high-quality-epub-html-from-pdf/

 


PDFTron PDFNet for .Net Now Available From Nuget.org

Recently we added PDFNet to the Nuget gallery. For our .Net users this is a great new way to keep up on the latest PDFNet releases.

www.nuget.org/packages/PDFNet/

This package includes all of our .Net releases, for .Net 2.0 – 3.5 and .Net 4.0 +, and the corresponding 32 and 64 bit assemblies. When you install the package it of course picks the appropriate version and sets everything up for you.

A common pain point for our .Net users was using both our 32 and 64 bit assemblies in an AnyCPU project. While an installer would normally take care of this, it was still frustrating for developers. To address this issue, our Nuget package includes a new assembly, PDFNetLoader, that picks the correct version of PDFNet at runtime. The source code for the loader is available on GitHub, and the latest binaries here.

If you install the Nuget package, all of this is set up for you. Before calling any PDFNet methods, simply insert the following line of code in the main class of your project.

private static pdftron.PDFNetLoader loader = pdftron.PDFNetLoader.Instance();

You can also change the load path for the PDFNet assemblies by appending the following call to the line above:

Path("path_to_pdfnet_folder_containing_32_and_64_bit_PDFNet_libs")

For example

private static pdftron.PDFNetLoader loader = pdftron.PDFNetLoader.Instance().Path(@"..\res");

Technical Details

On the other hand, if you are already using PDFNet, or don’t want to use the Nuget package, here are the manual steps to set up your project to use PDFNet and AnyCPU.

First let us assume your output dir (where your exe is created) is ‘bin’.

  1. Create a folder in ‘bin’ called ‘PDFNet’
  2. In ‘PDFNet’ create two folders, one called ‘x64’, the other ‘x86’
  3. Place the PDFNet.dll for x86 and x64 in the respective folders above.
  4. Download the latest PDFNetLoader.dll : github.com/PDFTron/PDFNetLoader/releases
  5. In your Visual Studio project add PDFNetLoader.dll as a reference.
  6. In your Visual Studio project add bin/PDFNet/x86/PDFNet.dll as a reference.
  7. Right click the PDFNet.dll (not PDFNetLoader.dll) in your references, and select Properties.
  8. Change the “Copy Local” property to “False”
  9. Somewhere in your code, as early as possible, and before any PDFNet methods are called, add the following line.
private static pdftron.PDFNetLoader loader = pdftron.PDFNetLoader.Instance();
You are now ready to run your project in AnyCPU.

The project references to PDFNet and PDFNetLoader allow you to code normally. At runtime though, the following will happen.

  1. The PDFNetLoader.Instance() method is triggered, which registers itself as an event handler for AppDomain.AssemblyResolve.
  2. JIT encounters a method that uses PDFNet and tries to load PDFNet.dll, but because “Copy Local” is set to “False” this fails.
  3. PDFNetLoader gets the AssemblyResolve event, detects whether the system is 32 or 64 bit, and loads the corresponding PDFNet.dll.
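
The selection logic in step 3 can be sketched in a few lines of Python. This is not a port of PDFNetLoader, which is a .Net assembly whose real source is on GitHub. The function names and the version string in the usage example are assumptions, while the folder layout follows the manual steps above.

```python
import os
import struct
from typing import Optional


def is_64bit_process() -> bool:
    """A pointer is 8 bytes wide in a 64-bit process and 4 bytes in a 32-bit one."""
    return struct.calcsize("P") == 8


def resolve_pdfnet(base_dir: str, assembly_name: str, is_64bit: bool) -> Optional[str]:
    """Return the path to load for assembly_name, or None to leave other assemblies alone."""
    # A full assembly name looks like "Name, Version=..., Culture=...": keep the short name.
    if assembly_name.split(",")[0].strip() != "PDFNet":
        return None
    arch = "x64" if is_64bit else "x86"
    return os.path.join(base_dir, "PDFNet", arch, "PDFNet.dll")


# The version string is a made-up example.
print(resolve_pdfnet("bin", "PDFNet, Version=6.0.0.0", is_64bit_process()))
print(resolve_pdfnet("bin", "System.Xml", is_64bit_process()))  # not ours: None
```

The important detail is that the handler only answers for PDFNet and returns nothing for every other assembly, so normal resolution continues for the rest of the application.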

We hope our .Net users find this useful, and as always you can ask questions in the PDFNet Forum on Stack Overflow.


Using PDFNet From a Swift Project on iOS

With Xcode 6, Apple introduced a brand new programming language, Swift. PDFNet is easy to use with a Swift project. This post will show how to set up a new Swift project and display a PDF.

For a comprehensive introduction to PDFNet on iOS, please see Getting Started on iOS. A completed Swift sample project is included with the most recent version of PDFNet for iOS.

1. Open Xcode 6.0.1+, and select File->New Project->Single View Application. Create a new Universal Swift app.

2. Add the following PDFNet files to your project:

  • Lib/PDFNet.framework
  • Lib/libTools.a
  • Lib/pdfnet.res
  • Lib/NSObjectInitWithCptr.h
  • Lib/NSObjectInitWithCptr.m
  • Lib/src/PDFViewCtrlTools/Tools/ToolManager.h
  • Lib/src/PDFViewCtrlTools/Tools/PanTool.h
  • Lib/src/PDFViewCtrlTools/Tools/Loupe.png
  • Lib/src/PDFViewCtrlTools/Tools/Loupe@2x.png
  • TestFiles/mech.pdf

3. Link in the following Apple frameworks

  • MediaPlayer.framework
  • CoreText.framework
  • CoreMedia.framework

4. Change the extension of Lib/NSObjectInitWithCptr.m to .mm (so that the C++ library will be linked in).

5. Add a new header called BridgingHeader.h and change its contents to the following:

#ifndef SwiftSample_BridgingHeader_h
#define SwiftSample_BridgingHeader_h

#import <PDFNet/PDFViewCtrl.h>
#import <PDFNet/PDFNetOBJC.h>
#import "PanTool.h"
#import "ToolManager.h"

#endif

6. Identify the bridging header as such in the project settings. In the Swift Compiler – Code Generation section, set Objective-C Bridging Header to the path to BridgingHeader.h.

[Screenshot: the Objective-C Bridging Header build setting]

7. In the file ViewController.swift, change the viewDidLoad() function to the following:

override func viewDidLoad() {
    super.viewDidLoad()

    // Do any additional setup after loading the view, typically from a nib.
    // initialize PDFNet
    PTPDFNet.Initialize("")

    // set the resource file
    let resourceFilePath = NSBundle.mainBundle().pathForResource("pdfnet", ofType: "res")
    PTPDFNet.SetResourcesPath(resourceFilePath!)

    // create a PTPDFViewCtrl and add it to the view
    let ctrl = PTPDFViewCtrl()
    ctrl.frame = CGRect(origin: CGPointZero, size: self.view.bounds.size)
    ctrl.SetBackgroundColor(180, g: 180, b: 180, a: 255)
    self.view.addSubview(ctrl)

    // open the PDF document included in the bundle in the PTPDFViewCtrl
    let docPath = NSBundle.mainBundle().pathForResource("mech", ofType: "pdf")
    let doc = PTPDFDoc(filepath: docPath!)
    ctrl.SetDoc(doc)

    // add the tool manager (used to implement text selection, annotation editing, etc.)
    let toolManager = ToolManager(PDFViewCtrl: ctrl)
    ctrl.toolDelegate = toolManager
    // pass the tool's class object; PanTool.self is the Swift spelling of [PanTool class]
    toolManager.changeTool(PanTool.self)
}

If you run the project, the program should now display the PDF mech.pdf. As stated above, a completed version of this sample is available in the latest version of PDFNet, so please check it out!


PDFNet for .Net and AnyCPU

In a previous post we announced the availability of PDFNet in Nuget, and also discussed how we made PDFNet work with the AnyCPU configuration.

Feedback for the AnyCPU setup has been great, but our users still had some technical questions on usage that we would like to answer here.

Overview

Feel free to skip straight to the Breakdown section if you are in a rush, otherwise please read on.

One of the great advantages of PDFNet is that it is truly cross platform: our users can easily add PDF support and related business logic to their workflows and applications, across many different mobile and desktop environments, using essentially the same interface.

One complication of this, though, is that a native code library must be loaded. For this post we will be discussing .Net on Windows desktop.

To make a native code library compatible with .Net, PDFTron compiles it for the Common Language Runtime (CLR), which allows us to create a native code assembly that is usable by C# and VB projects. However, CLR projects cannot be AnyCPU: an architecture needs to be selected (x64, x86, ARM, etc.), and the assembly must dynamically link to the C++ Runtime, which means any CLR assembly depends on the VC++ Redistributables.

Furthermore, PDFNet for .Net comes in two versions, one for clients using .Net 2.0-3.5 and another for .Net 4+. These two versions are built using different Visual Studio toolsets, and therefore have different C++ runtime version dependencies; VC++ 2008 and VC++ 2010 respectively.

Breakdown

So with the above information, we have the following breakdown of individual PDFNet for .Net assemblies.

Download        Architecture   .Net Framework   VC++ Redistributable
PDFNet          x86            2.0 – 3.5        VC++ 2008 Redistributable x86
PDFNet          x64            2.0 – 3.5        VC++ 2008 Redistributable x64
PDFNetDotNet4   x86            4.0+             VC++ 2010 Redistributable x86
PDFNetDotNet4   x64            4.0+             VC++ 2010 Redistributable x64

VC++ Redistributable

None of the PDFNet downloads, including our NuGet package, include any of the redistributable libraries. This is typically not noticeable on the developer’s computer, as these are included when Visual Studio is installed.

However, when it comes time to distribute your program to a server or a client’s machine, you will probably need to provide the redistributable.

The links above for the VC++ Redistributables contain the MSVCP[90|100].dll and MSVCR[90|100].dll.

AnyCPU

So how does a .Net developer use PDFNet with the AnyCPU configuration, considering the above conditions?

Essentially, we need to delay the assembly loading until runtime, when the actual architecture is known. To facilitate this we created the PDFNetLoader assembly. This is included in the download for PDFNet, and if you look at the sample projects you will see they are all set to AnyCPU.

If you are interested, the source code for PDFNetLoader is available on our GitHub page.

Configuring an Existing Project

So how can you configure your existing project to use PDFNet and AnyCPU? First, download the PDFNet zip from our site.

PDFNet .Net 2.0 – 3.5
PDFNet .Net 4.0+

Next, let us assume your output dir (for example where your exe is created) is called ‘Bin’.

  1. Extract the contents of the PDFNet zip archive you downloaded above.
  2. Copy the ‘Lib’ folder in the PDFNet package to the same folder as your Visual Studio project.
  3. In your Visual Studio project select Properties and Build Events.
  4. In Build Events add the following text into the Post-build event command line window.
    xcopy "$(ProjectDir)Lib\PDFNet" "$(TargetDir)PDFNet" /S /I /Y

  5. In your Visual Studio project add Lib/PDFNetLoader.dll as a reference.
  6. In your Visual Studio project add Lib/PDFNet/x86/PDFNet.dll as a reference.

    Note: Here we use the 32bit version so the project will work on 64 or 32bit OS.

  7. Right click the newly added PDFNet.dll (not PDFNetLoader.dll) in your Visual Studio project References, and select Properties.
  8. Change the “Copy Local” property to “False”

  9. Somewhere in your code, as early as possible, and before any PDFNet methods are called, add the following line.
    private static pdftron.PDFNetLoader loader = pdftron.PDFNetLoader.Instance();
  10. Finally, add a call to Initialize PDFNet somewhere in your code that would be called after the line above.
    pdftron.PDFNet.Initialize();
You are now ready to run your project in AnyCPU.

The project references to PDFNet and PDFNetLoader allow you to code normally. At runtime though, the following will happen.

  1. The PDFNetLoader.Instance() method is triggered, which registers itself as an event handler for AppDomain.AssemblyResolve.
  2. The JIT encounters a method that uses PDFNet and tries to load PDFNet.dll, but because “Copy Local” is set to “False”, this fails.
  3. PDFNetLoader gets the AssemblyResolve event, detects whether the system is 32 or 64 bit, and loads the corresponding PDFNet.dll.

Note that if you want to put the PDFNet assemblies somewhere else, you can; use PDFNetLoader.Path to set the directory.

Distribution to Your Server or Client

The above helps developers build and test using AnyCPU, but how do you distribute the resulting program, especially considering you need the VC++ Redistributable as described earlier?

If you have an installer for your program, then certainly this would be a great time to detect the OS architecture (x86 or x64), and install the corresponding PDFNet and VC++ Redistributables for that type. In this case there would be no need for the PDFNetLoader.dll assembly.

There are two ways to “install” the VC++ Redistributables. Either you can include the dlls that Microsoft provides in the redistributable and put them with your program binary so they are available to load at runtime. See this Microsoft article for more on possible locations. Or you can run the VC++ Redistributable installer itself, though this would probably require your installer to have elevated rights.

We don’t recommend trying to support x86/x64 runtime loading on the client machine, because you also need to load the correct (x86 or x64) VC++ Redistributables, and this cannot be done using the PDFNetLoader.dll because they are unmanaged libraries. You can of course install both the 32-bit and 64-bit versions of the VC++ Redistributable if you really want to support both architectures; the OS will then take care of loading the correct one for you.

Can I use PDFNet only in an x86 or x64 Application?

Of course! For example the x86 version is able to run on either 32 or 64bit machines.

Simply add PDFNet.dll to your Visual Studio project references (along with the rest of your referenced assemblies), and make sure that copy local: true is set (the default), and you are ready to develop.

Note, if you just want to use 32-bit PDFNet on both 32-bit and 64-bit operating systems, you should know that PDFNet is built LargeAddressAware, but that this is not enabled by default in a .Net binary. If you execute the following line on your exe as part of your post-build process

editbin /LARGEADDRESSAWARE 'your exe' 

then your 32bit process (and PDFNet) will be able to use up to 3GB of memory on a 32bit OS, and 4GB on a 64bit OS. This should be enough to open any PDF you encounter.

Distribution to Your Server or Client

This step is certainly simpler than what is described earlier, but of course you still need to make sure that the correct VC++ Redistributable is available or installed; see above for details.

Further Reading

We hope our .Net users find this useful, and as always you can ask questions in our Forum or on Stack Overflow.


PDF Day in Washington DC and New York City

PDFTron would like to invite you to join us at PDF Day. Hosted in Washington, DC on December 10, 2014 and New York City on December 11, 2014, the event will provide CIOs, IT executives, content strategists and document management vendors the big picture on PDF technology – not sales pitches – from top developers in the space.

We are a volunteer organizer and sponsor of the event, and will present a session ‘Collaborating with PDF’. We will explore some of the challenges and benefits of enabling secure, browser-based collaboration on documents, including PDF viewing, annotation, and other key features that allow users and groups to work collaboratively on the same PDF at the same time. This approach is presented as an efficient alternative to the conventional method of editing and sending email attachments. The session will be based on the expertise we’ve gained from the development and support of our WebViewer SDK, which enables universal document viewing, annotation, collaboration and processing on the web and across platforms.

Sound Interesting? You can find all the details on the event and other sessions using the links below:

Washington, DC (December 10th): http://www.pdfa.org/event/pdf-day-in-washington-dc-december-10/
New York City (December 11th): http://www.pdfa.org/event/2014-december-11-is-pdf-day-in-new-york-city/

Promo code: As a sponsor we’d like to invite you to register with our 50% promo code AT-PDF-DAY-THANKS-TO-PDFTRON


Collaborating with PDF

PDFTron was pleased to present at the PDF Association’s recently held PDFDay conference in Washington, DC and New York City. James Borthwick, a member of our development team, presented a talk on Collaborating with PDF: Where we are today, and what’s next. It is now available online:

The talk addresses the following topics on collaborating with PDF:

  • What is PDF Collaboration?
  • How do we collaborate today (hint: this hasn’t changed in 15 years)
  • How can we radically improve PDF collaboration?

It then concludes with an audience Q & A.

Some previous blog posts related to this topic include A PDF Viewer in an HTML5 app and Document Collaboration with PDFNet.

We also presented a brief outline of some of the capabilities of PDFTron’s software, which you can see here in just over 3 minutes.

We cover the following, with details of course available on our website.

  • PDF/A Validation and Conversion
  • Mobile & Desktop viewing, editing and annotating
  • Web: viewing, annotating and collaborating
  • Conversion to and from ePub, HTML, XPS, SVG, etc.
  • Redaction
  • Optimization
  • Forms
  • Digital Signatures

Cross-Platform Word to PDF Conversion

View and Convert Microsoft Word Documents Anywhere

We’re very pleased to announce the launch of the newest addition to PDFNet SDK: built-in Word conversion.  Now you can go straight from .docx to .pdf,  free from the shackles of Microsoft Word or any other 3rd party software.  Conversions are accurate and fast; they also work on any platform supported by PDFNet SDK  (and there are a lot of them! see the SDK download page for more details).

It took a long time to get the text to flow around shapes correctly in the docx engine.

Dependency-free Word conversion enables a couple of great use cases: you can  perform reliable conversions in a server environment, or pair it with our PDF Viewer for seamless viewing of .docx files on Android, iOS, and Windows Phone/RT.

Easy to Use, but Still Flexible

We’ve tried to strike a balance between power and simplicity in the API. The best way to demonstrate this is through a quick example: this small snippet demonstrates how to convert a Word document to a PDF (in Java).

// Start with a PDFDoc (the conversion destination)
PDFDoc pdfdoc = new PDFDoc();
// perform the conversion with no optional parameters
Convert.wordToPdf(pdfdoc, "input_file.docx", null);
// save the result
pdfdoc.save("output_file.pdf", SDFDoc.e_remove_unused, null);

That’s it, just 2.5 lines of code. Of course, maybe you would rather have more control over the conversion process. That’s possible too: the interface allows for cancellation, progress reporting, page-by-page conversion, and diagnostic messages (for example, information on font substitutions). Here is the same conversion, performed page-by-page and with progress reporting.

// get a DocumentConversion object, which encapsulates and controls
// the conversion process
DocumentConversion conversion = Convert.wordToPdfConversion(
    pdfdoc, "input_file.docx", options);
// convert each page, one-by-one, with progress reporting
while(conversion.getConversionStatus() == DocumentConversion.e_incomplete)
{
    conversion.convertNextPage();
    System.out.println("Progress: " + (conversion.getProgress()*100.0) + "%");
}
// save the result
pdfdoc.save("output_file.pdf", SDFDoc.e_linearized, null);

To see these snippets as part of a fully working application, take a look at the WordToPDFTest sample project in the PDFNet SDK trial package, available here.

No Fonts? No Problem

While fonts can be embedded within .docx documents, they typically aren’t. On a typical Windows system this isn’t a problem: the most common fonts in Word documents (Calibri, Times New Roman, Arial, Cambria, etc.) are installed by default on every Windows system, and they can be used while converting or viewing the document.

On other systems, such as Linux servers or Android phones, these fonts are only available in special circumstances, and without them you would normally be limited to two options: a) distributing the original fonts alongside your app, or b) settling for poor conversion results.

With the PDFNet SDK, this is no longer the case: we employ a number of strategies to ensure that conversion remains faithful to the original — with content in the right place and on the right page — even when supplied with no external fonts at all. For a more practical and in-depth look at font-handling, see this  knowledge base article.

Good, and Getting Better

We put a lot of work into our .docx converter. We’re very proud of this product, and we’re committed to improving it.

For the vast majority of documents created in a recent version of Word (Word 2010 or Word 2013, for example), the converter will yield excellent results, often indistinguishable from Word itself. Unfortunately, the .docx format is extensive and underspecified: the specification is more than 5000 pages, and is riddled with omissions and exceptions. There are bound to be features or behaviours that we have not quite nailed down. But that’s ok! We’re a small dev team and we move fast. If you’ve got a use case in mind, download our SDK here and try it out. If something isn’t working for you, then let us know, and chances are we’ll get it cleared up right away.

What About PowerPoint and Excel?

Work continues: PowerPoint to PDF and Excel to PDF are both on the way. We’ve accumulated a lot of great underlying technology while making the Word engine, and we plan to put that tech to use as we tackle the .pptx and .xlsx formats over the next year.

Give it a Try!

The built-in .docx conversion module is available as part of the PDFNet SDK for Windows, Linux, Mac, Android, iOS, and Windows Phone/RT. To obtain a free trial, visit our downloads page. Interfaces to the module are available in   C#, Java, Objective-C, Visual Basic, Python, Ruby, PHP,  C, and C++ (subject to platform availability: there is no PHP available on Android, for example).

The SDK download contains fully functional sample applications which demonstrate how to use the converter. These samples are available in C#, Java, Objective-C, C++, and Visual Basic. Look for the WordToPDFTest directory under Samples in the SDK package.

Want More Information?

If you have technical questions or would like information regarding licensing, please contact us. Your inquiry will be directed to a developer or our sales team, as appropriate.


PDFTron at the PDF Technical Conference 2015

PDFTron is pleased to announce that we are a sponsor and presenter at the upcoming PDF Technical Conference 2015, held October 19-20 in San Jose, California. Aimed at software developers and technical product managers encountering PDF technologies in their work, the event will consist of educational and sponsored sessions presented by experts in the PDF field.

As a sponsor and presenter at the event, we will be holding two sessions: “Introducing PDFNetJS: the first complete PDF toolkit for the browser” on October 19, and “Semantic content recognition in PDF, and what’s next” on October 20. PDFTron’s CTO, Ivan Nincic, will also be part of the conference’s closing panel discussion on “PDF as a Platform – the Challenge and the Prize”. In our presentations, we will be introducing our new web-based technology, PDFNetJS, the first complete PDF toolkit for the web enabling client-side JavaScript PDF processing, including PDF viewing, annotation, form-filling, file conversion, and more! In our educational session, we’ll be discussing semantic content recognition in PDF and its uses such as in accessibility support and reflow, and what’s next. In addition to PDFTron’s sessions, the conference will cover a wide variety of subjects, including mobile devices, digital signatures, metadata, what’s coming in PDF 2.0, and more.

Interested? Find more information on the conference and view the complete conference schedule.


Introducing PDFNetJS: A Complete Browser-Side PDF Viewer and Editor

The WEB is taking over (obviously)

On desktop computers, web apps continue to replace activities that were previously fulfilled by Windows/Mac/Linux programs. The advantages are many: web apps are immediately available on every connected computer; the user doesn’t need to download something; they instantly update and they’re cross-platform. That they naturally lend themselves to a subscription model is yet another reason that companies are choosing to develop web apps in favor of a traditional desktop program.

However, web apps have historically had a number of shortcomings. An inability to deal with local files (without long uploads). Multimedia required security-challenged plugins. And they couldn’t display PDF files.

How do web apps deal with PDFs today?

Web apps typically deal with PDFs in one of two ways: either they offer the PDF as a download (e.g. for archiving), or they render it in-app using a web technology.

Of course, PDF is not a “web technology”: it is not part of or referenced by the HTML/CSS/JS specifications, and can’t be naturally displayed within a web page. So to display a PDF using web technologies, web apps rely on a server to convert the PDF to a web technology, such as HTML, PNG, or SVG.

This is an acceptable solution for some use cases, and one that we’ve offered to our customers for a number of years. (We also developed what in our opinion is the optimal conversion solution, which converts PDFs and other documents such as Office to a web-optimized XPS file format; see the WebViewer product page for more info.) However, all server conversion based solutions have a number of drawbacks which can’t be overlooked.

Why server conversion is a problem

Server conversion has to date been the best workaround for the fact that web browsers can’t display PDFs. These are the main shortcomings of server conversions:

  • Server conversion adds delays, which can be long.

    There are three things that the user needs to wait for: uploading the PDF to the server, waiting for the server to convert the PDF to HTML/SVG/PNG and then downloading the converted version. Nobody likes to wait.

  • Servers cost money and create hassles.

    Application servers are costly to run and maintain. They also require additional developer time and expertise.

  • Anything that runs on a server will need to be scaled with usage, which is not true with client-side execution.

How can we do better?

At PDFTron, we knew that our customers would love a solution that didn’t require converting PDFs on a server. But how would that be possible?  Could we make PDF a first-class web citizen, that is a file format that a web browser could independently display?

We investigated, and it didn’t take long to narrow in on the HTML5 <canvas> element, which as it happens is used as the basis for the Mozilla pdf.js project.

What is <canvas>?

Here’s the description of the canvas element, as described by Mozilla:

[Image: Mozilla’s description of the canvas element]

Canvas allows one to use JavaScript APIs to draw shapes, gradients, blends, images, text – all of the component pieces that make up a PDF page. So the question is: if it can draw the component parts of a PDF, can it be used to draw a PDF?

How to use <canvas> to render PDF

The idea is to write a JavaScript program that parses and understands the PDF file, and uses the canvas to render all the elements to the page.

For example, here is the PDF code that draws a red circle.

1 0 0 RG
0 0 0 0 k
7 w
376.868 580.836 m
376.868 580.836 l
376.868 627.323 340.405 665.008 295.425 665.008 c
250.445 665.008 213.982 627.323 213.982 580.836 c
213.982 534.349 250.445 496.664 295.425 496.664 c
340.405 496.664 376.868 534.349 376.868 580.836 c
h
S

The JavaScript program would ingest the PDF file and emit JavaScript that draws a red circle using the canvas:

<canvas id="myCanvas" width="612" height="792"></canvas>

var canvas = document.getElementById('myCanvas');
var context = canvas.getContext('2d');
var centerX = canvas.width / 2;
var centerY = canvas.height / 2;
var radius = 70;
context.beginPath();
context.arc(centerX, centerY, radius, 0, 2 * Math.PI, false);
context.fillStyle = 'transparent';
context.fill();
context.lineWidth = 7;
context.strokeStyle = 'red';
context.stroke();
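
To make the translation step a little more concrete, here is a minimal sketch of how a few PDF path operators could be mapped onto canvas calls. It is an illustration only, not how pdf.js or any PDFTron product is implemented: the function names translateContentStream and replay are hypothetical, only a handful of operators are handled, and everything else (such as the k operator in the snippet above) is simply ignored.

```javascript
// Hypothetical sketch: map a few PDF path operators onto canvas calls.
// PDF content streams are postfix: operands come first, then the operator.
function translateContentStream(stream, pageHeight) {
  var calls = [];     // each entry: [method, args...] or ['set', property, value]
  var operands = [];

  // PDF puts the origin at the bottom-left (y up); canvas uses top-left (y down).
  function flipY(y) { return pageHeight - y; }

  stream.split(/\s+/).filter(Boolean).forEach(function (token) {
    var number = Number(token);
    if (!isNaN(number)) { operands.push(number); return; }

    var a = operands;
    switch (token) {
      case 'm': calls.push(['moveTo', a[0], flipY(a[1])]); break;
      case 'l': calls.push(['lineTo', a[0], flipY(a[1])]); break;
      case 'c': calls.push(['bezierCurveTo', a[0], flipY(a[1]),
                            a[2], flipY(a[3]), a[4], flipY(a[5])]); break;
      case 'h': calls.push(['closePath']); break;
      case 'w': calls.push(['set', 'lineWidth', a[0]]); break;
      case 'RG': // stroke colour, components in the range 0..1
        calls.push(['set', 'strokeStyle', 'rgb(' + a.slice(0, 3).map(function (c) {
          return Math.round(c * 255);
        }).join(', ') + ')']);
        break;
      case 'S': calls.push(['stroke']); break;
      default: break; // operator not handled by this sketch
    }
    operands = [];    // operands are consumed by each operator
  });
  return calls;
}

// Replays the recorded calls on a 2D context (or any object with the same methods).
function replay(calls, context) {
  context.beginPath();
  calls.forEach(function (call) {
    if (call[0] === 'set') {
      context[call[1]] = call[2];
    } else {
      context[call[0]].apply(context, call.slice(1));
    }
  });
}
```

In a browser, replay(translateContentStream(pdfCode, 792), canvas.getContext('2d')) would stroke the path described by the PDF snippet, assuming pdfCode holds that snippet as a string. A real renderer also has to track the graphics state, transforms, fills, clipping, text and images.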

The program would use the same principle to draw all PDF elements, and hence render the entire PDF. Sounds great, but unfortunately there are problems.

Why <canvas> is inappropriate for PDF documents

While in theory using canvas sounds like a good idea, there are actually a number of problems that make it unsuitable for use as a PDF renderer.

The first problem is that all rendering commands are issued in the UI thread, meaning complex pages either freeze the browser, or require constant calls to setTimeout() which slows down rendering.
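
As a rough illustration of the setTimeout() workaround mentioned above, the sketch below runs a list of drawing operations in small batches and hands control back to the browser between batches. The helper name renderInChunks and its parameters are hypothetical; the scheduler is passed in as an argument only so the behaviour is easy to observe.

```javascript
// Hypothetical helper: run drawing commands in batches so the UI thread can
// process input between batches. `schedule` defaults to a zero-delay setTimeout.
function renderInChunks(commands, chunkSize, onDone, schedule) {
  schedule = schedule || function (fn) { setTimeout(fn, 0); };
  var index = 0;

  function runBatch() {
    var end = Math.min(index + chunkSize, commands.length);
    for (; index < end; index++) {
      commands[index]();      // one drawing operation
    }
    if (index < commands.length) {
      schedule(runBatch);     // yield to the browser, then continue
    } else if (onDone) {
      onDone();               // all commands have been executed
    }
  }

  runBatch();
}
```

Every hand-off keeps the page responsive, but it also adds waiting time before the next batch starts, which is the slowdown described above: smaller batches mean a smoother UI and a slower render.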

The second issue is that canvas is hardware accelerated, which makes some rendering operations faster, but also unreliable. The problem is that by shifting responsibility for drawing to hardware, it relies on drivers, which are sometimes buggy. Because of this, the creator of the PDF renderer cannot guarantee it will render PDFs correctly, because much of the critical rendering code is outside of their control.

Lastly, <canvas> and PDF have different graphics models. Because PDFs contain non-image data, the renderer (i.e. <canvas>) needs to know how to interpret that information and render it into an image. If the renderer doesn’t understand certain constructs, then it cannot render the page correctly. Unfortunately, the canvas model does not overlap perfectly with the PDF model, so a PDF can contain information that is impossible for a canvas to draw.

An example of this can be taken directly from the PDF specification.

Here is a PDF snippet that draws some text that has a gradient fill, where the text and fill have different transforms.

BT
/Pattern cs /P0 scn
/GS0 gs
/TT0 1 Tf
30 0 0 30 181.5 494 Tm
(ABCDEFGHIJKLM)Tj
ET

This is what it looks like when it’s rendered:

[Image: the text rendered with its gradient fill]

However, in canvas, text and its fill cannot have different transforms. So what happens when this is put to canvas? As pdf.js uses canvas, we can look to it as an example. In this case, it outputs nothing, just a blank space where the text should be. Below is the page in question, rendered correctly on the left, and by pdf.js on the right.
[Image: the page rendered correctly (left) and by pdf.js (right)]

What makes PDF, PDF?

It’s worth pausing at this point to consider, what makes a PDF a PDF? Why do people use PDF as a file format? We think that a recently posted article on the PDF Association website put it very well:

Truth #2. It’s about a completely reliable experience

The PDF model begins with at least one page, usually including some text or images. PDF allows for many other features like digital signatures, encryption, attached files, metadata, and semantic information (tags) associated with that page, but the format’s core value is based on its ability to reliably represent the document author’s intent in all respects.

The primary value proposition of PDF is that it looks the same everywhere. This reliability is a cornerstone of the PDF format, so if a PDF renderer only renders some PDFs correctly, or parts of a PDF correctly, is it really a PDF renderer? At PDFTron, the thought of shipping an unreliable renderer is totally unacceptable, and something that we could not do. So <canvas> was out.

But we still wanted to provide web-based PDF rendering. So what could we do?

PDFNetJS

The answer, as we’ve discovered, does not lie in the latest and greatest web technology. The answer is actually old, using the same method as reliable desktop renderers. Reliable desktop renderers don’t use OS-provided graphics libraries like Direct2D or Quartz, but implement their own rendering internally. This way it is completely controlled by the renderer, and so correct rendering can be guaranteed. And this is in fact exactly what we’ve done with PDFNetJS. With PDFNetJS, the rendering is completely controlled by us, and is completely reliable.
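
To illustrate the general idea of a renderer that owns its pixels, here is a deliberately tiny sketch of software rasterization: it fills a rectangle by writing RGBA values straight into a buffer instead of asking a graphics API to draw it. This is an assumption-laden toy and says nothing about how PDFNetJS works internally; createSurface and fillRect are hypothetical names.

```javascript
// Hypothetical sketch: a software "surface" is just a block of RGBA bytes.
function createSurface(width, height) {
  return {
    width: width,
    height: height,
    pixels: new Uint8ClampedArray(width * height * 4) // starts fully transparent
  };
}

// Fills an axis-aligned rectangle, clipped to the surface bounds.
function fillRect(surface, x, y, w, h, rgba) {
  var x0 = Math.max(0, x);
  var y0 = Math.max(0, y);
  var x1 = Math.min(surface.width, x + w);
  var y1 = Math.min(surface.height, y + h);

  for (var row = y0; row < y1; row++) {
    for (var col = x0; col < x1; col++) {
      var offset = (row * surface.width + col) * 4;
      surface.pixels[offset] = rgba[0];     // red
      surface.pixels[offset + 1] = rgba[1]; // green
      surface.pixels[offset + 2] = rgba[2]; // blue
      surface.pixels[offset + 3] = rgba[3]; // alpha
    }
  }
}
```

Because every pixel is computed by the program itself, the result does not depend on a browser's drawing code or on graphics drivers. In a browser, a finished buffer like this can be displayed by wrapping it in an ImageData object and calling putImageData on a 2D context, which uses the canvas purely as a display surface.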

More than that, PDFNetJS is actually a complete PDF toolkit. It provides the same extensive functionality as our desktop SDKs, because it is the same SDK, compiled for the web.

PDFNetJS has, among others, the following capabilities:

  • rock-solid in-browser rendering
  • annotate PDFs and fill out forms
  • generate PDFs
  • split/merge
  • reorder and organize pages
  • redact pages
  • extract text
  • convert PDF
  • PDF/A conversion/validation
  • encryption, decryption
  • optimization

PDFNetJS: No Server Component

Not relying on a server component for in-browser PDF viewing provides a number of substantial advantages, for both the programmer and in the quality of the finished app. From our perspective, these are the top advantages.

 With PDFNetJS:

1. Apps are easier to develop

Skipping a server component means a number of things from a developer’s point of view. First, there is no need to set up a PDF server stack, so one substantial task is eliminated there.

Applications are easier to write, because complexities involved in spreading the PDF logic between client and server are eliminated. Similarly, asynchronous client-server communication is avoided, which is often a source of bugs, and ones which are often hard to reproduce and hence to fix. Having all the code in one place makes programs easier to write, and easier to write without bugs.

With server conversion, you need to keep duplicate representation of content, which may or may not be ready or need updating. This is a form of caching, which we all know is one of the two hard things in computer science, and a frequent source of bugs and frustration. With PDFNetJS, there is no server, and no caching.

2. Apps are more responsive

With a server, a person needs to wait for their PDF to upload and be converted before they can do anything with it. This can be slow and tedious, and so undesirable. With PDFNetJS, local files can be viewed and worked on immediately, without an upload or conversion step. Cloud files can be instantly saved locally after annotating or modifying, without a 2nd download step.

3. Apps are more reliable

A server is typically responsible for many, many end-users. If there’s a bug in your PDF processing logic that crashes or freezes your server, it has the potential to cause a major system-wide disruption. By placing all of the PDF processing in each user’s browser, they are all naturally isolated, so a problem encountered by one user will have no effect on others.

A second benefit is that as PDFs are a type of user input, placing their processing on the client side provides a large security benefit. The web server never needs to process the PDF – even if it’s stored online, it can be treated as a binary blob and never opened, except on the user’s computer. From the user’s point of view, the PDF may not need to ever hit the server, providing them with privacy.

4. Apps are more scalable

PDF processing can be CPU intensive, so if you convert PDFs on a server, it means the more users you have, the more PDFs you’ll need to process and the more servers you’ll need to pay for and maintain. With PDFNetJS, all of the processing is done client side, which is as scalable as the plugin model: infinite, and free.

Code

So what is it like to use in code? Pretty simple.

Viewing a PDF

<script>
var container = document.createElement("div");
var webViewer = new PDFTron.WebViewer({
    initialDoc : "WebViewer_Dev_Guide.pdf"
}, container );
</script>

This presents a PDF viewer as shown in our online demo. It can be used as-is, or of course customized for both functionality and look/feel, to fit the requirements of your application.

Merging PDFs

This demonstrates how to merge two PDFs. If you’re familiar with the PDFNet API, it will look immediately familiar, as the API is the same as on our other platforms.

var combined_doc = yield exports.PDFNet.PDFDoc.create();
var doc1 = yield exports.PDFNet.PDFDoc.createFromURL("Doc1.pdf");
var doc2 = yield exports.PDFNet.PDFDoc.createFromURL("Doc2.pdf");
var doc1PageCount = yield doc1.getPageCount();
combined_doc.insertPages(1, doc1, 1, doc1PageCount);
var doc2PageCount = yield doc2.getPageCount();
combined_doc.insertPages(doc1PageCount+1, doc2, 1, doc2PageCount);

// Save our newly merged document
var docbuf = yield combined_doc.saveMemoryBuffer(exports.PDFNet.SDFDoc.SaveOptions.e_linearized);
saveBufferAsPDFDoc(docbuf, "mergedDocument.pdf");

You may be wondering what all the yield statements are. These allow a special type of function (known as generator functions) to be conveniently run and resumed in a style that looks like regular code. This is required because PDFNet uses web workers to execute on a background thread, which keeps the UI responsive.
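
For readers who haven't met this pattern before, here is a minimal sketch of a runner that drives a generator function whose yield expressions produce promises. It only illustrates the general technique and is not the runner shipped with PDFNetJS; runGenerator and fetchPageCount are hypothetical names, and fetchPageCount is a stand-in for any asynchronous call.

```javascript
// Hypothetical runner: resumes the generator each time a yielded promise settles.
function runGenerator(generatorFunction) {
  return new Promise(function (resolve, reject) {
    var generator = generatorFunction();

    function step(method, input) {
      var result;
      try {
        result = generator[method](input); // run until the next yield or return
      } catch (error) {
        reject(error);                     // the generator threw and did not recover
        return;
      }
      if (result.done) {
        resolve(result.value);             // the generator returned
        return;
      }
      Promise.resolve(result.value).then(
        function (value) { step('next', value); },  // feed the result back in
        function (error) { step('throw', error); }  // surface failures at the yield
      );
    }

    step('next');
  });
}

// Stand-in for an asynchronous API call (for example, one answered by a web worker).
function fetchPageCount(name) {
  return new Promise(function (resolve) {
    setTimeout(function () { resolve(name.length); }, 0);
  });
}

var totalPromise = runGenerator(function* () {
  var first = yield fetchPageCount('Doc1');   // pauses here until the promise resolves
  var second = yield fetchPageCount('Doc22');
  return first + second;
});
```

The body of the generator reads like ordinary sequential code, yet nothing blocks while it is paused: the runner resumes it only when each promise has settled.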

Performance

The performance of PDFNetJS is also very good. We’ve found that in most cases it’s as fast or faster than pdf.js.

When compared to native solutions, it’s about 95% as fast on Chrome, and about 50% as fast on other modern browsers. For most PDFs, the speed is sufficient that most users won’t notice any speed difference between PDFNetJS and a native app.

Showcase

After reading this we hope you’re itching to try it out. We have an online showcase available here:

http://pdftron.com/webviewer/pdfnetjs/samples.html

Conclusions

PDFNetJS is an exciting and unique offering in the PDF world, and one that changes what is possible for web app developers to achieve.

In contrast with server based conversion, PDFNetJS offers snappy performance and decreased development costs. In contrast with pdf.js, it offers correct rendering and a wide variety of PDF creation/modification APIs.

So please, download the SDK, give it a try, and tell us what you think.


pdf.js: Interesting Project, Incorrect Rendering

pdf.js is a well-known project for rendering PDF documents directly in the browser. In that sense, it is similar to our recently announced PDFNetJS. While pdf.js is an interesting project, and may be a reasonable choice in some very specific situations, it has a number of serious problems that make it unreliable for any situation where PDF rendering is important.

pdf.js vs PDFNetJS: Different Approaches

pdf.js was initially conceived by Mozilla (creators of Firefox) as a project to show off HTML5 and how it could be used to implement almost anything that a native app could do. PDF rendering was chosen because it was the domain of native apps, and because PDFs are composed of text, images and vector graphics, which a web browser should be able to render without too much difficulty by leaning on its existing native capabilities for these data types.

At first it might sound reasonably straightforward, but it didn’t take long for the authors to run into trouble. PDF text was problematic, because PDFs contain many different font types, most of which are not supported by browsers. Images were trouble too, because PDFs use compression formats (CCITT, JBIG2, JPEG 2000, and more) that were not browser-supported. Even the vector graphics were not compatible: concepts such as dashed lines, certain polygon fill rules and a number of blending operations were not present, and could not be added without being exceedingly slow.

To solve a number of these problems, Mozilla implemented the missing functionality using native code directly in Firefox, exposed as custom “HTML” extensions. Ironically, this reliance on Firefox-specific behaviour made pdf.js useful only on Firefox, and not truly an HTML5 program, so Mozilla began to lobby the other major browser developers to add these extensions to their browsers too. The lobbying has had limited success: Firefox has the custom extensions that pdf.js expects (of course), Chrome and Safari have some, and Internet Explorer and Edge have few.

The result of this is that pdf.js renders quite differently when used on different browsers, as shown below.

[Figure: pdf.js rendering in Firefox]

[Figure: pdf.js rendering in Chrome]

[Figure: pdf.js rendering in IE]

PDFNetJS, on the other hand, avoids these problems entirely. It does so by not relying on the browser’s built-in image, font or vector handling, including the new extensions. Because it instead handles all of these requirements internally, there are no rendering differences between browsers, and rendering is consistently high-quality.

Here is how PDFNetJS and desktop apps render the PDF snippet shown above, which you will note is actually different from the pdf.js output:

[Figure: Correct rendering (PDFNetJS)]

Partial PDF Implementation

Despite adding some missing functionality into the browser itself, Mozilla did not add everything that PDF needs, and its authors have no plans to do so. At a presentation on pdf.js, it was said that “I don’t think we’ll ever add support for some of the [patterns and gradients that] PDF supports, because they’re really kinda bizarre” (20:27). Of course, supporting these patterns and gradients is required to correctly render PDFs that use them, and if support is missing, the PDF will be rendered incorrectly, if at all.

Excluding parts of the PDF specification is very troubling, as the core promise of a PDF is that it reproduces content in an identical manner. The fact that the pdf.js project is comfortable producing a renderer that intentionally does not fulfill this promise runs against the very essence of PDF, and undermines the trust users put in the format. 

Beyond patterns and gradients, other parts of the PDF specification are also absent from pdf.js. Some examples of missing or incomplete features are soft masks (transparency), overprint, spot colours, and knockout groups, to name a few.

PDFNetJS, on the other hand, implements everything needed to render a PDF correctly, just as it is rendered on a desktop. Its rendering is identical to that of our desktop SDKs, and is so on all browsers.

Below are two more examples of PDFs that do not render correctly in pdf.js (and of course do in PDFNetJS):

 

[Figure: simple-doc. Left: PDFNetJS rendering (correct). Right: pdf.js rendering (incorrect).]

[Figure: car-ad. Left: PDFNetJS rendering (correct). Right: pdf.js rendering (incorrect).]

Forms, annotations and PDF editing

PDFs of course support far more than static viewing. One of PDF’s most used facilities is its ability to act as an electronic form, such as an income tax or passport form. Another widely used feature is annotations, which people use to mark up PDFs with highlights, notes and other markings for others or for later reference.

As useful as these features are, pdf.js does not support them. It does not have any facility to fill forms or to add or manipulate annotations. PDFNetJS, on the other hand, comes with both of these built in: it can fill forms (including processing JavaScript) and add/edit annotations.

Editing PDFs at a level below forms and annotations is also a requirement for many use cases. Document manipulation, such as split/merge, adding new pages, removing pages, rotating pages, and manipulating existing content on an element-by-element basis are useful abilities that PDFNetJS supports. This even extends to creating new, sophisticated PDFs from scratch. However, as with forms and annotations, pdf.js does not support changing PDF content in any way. 

Security

Although one of the motivations for using pdf.js was to enhance security, it has actually been responsible for enabling a number of security bugs.

Support

Support is an important part of any software offering. pdf.js is an open source project that Mozilla has created for Firefox’s own use. This means bugs and problems, if Mozilla decides they are worth fixing, are resolved on its schedule. Unfortunately, this means many bugs go months or even years without being addressed. For example, this rendering bug, which is reportedly a regression, was reported in February 2015, is still not fixed, and has not been assigned to anybody or added to a milestone. This rendering bug has been outstanding since March 2013. This rendering bug has been open since September 2012. The list goes on. Keep in mind that these rendering bugs are likely only a small handful of those that exist, as most Firefox users are probably not reporting bugs on GitHub.

In contrast, here at PDFTron the customer is our priority. If you need help regarding how to use PDFNet or have questions about how to use it most effectively, we are here to offer advice. Should you find a bug, chances are we will be able to provide a patched version in short order. We offer developer-to-developer technical support, ensuring accurate and timely answers to your questions or concerns.

The Future

Support is important not only for the present, but also for the future. The web is changing at a rapid pace, so maintenance for compatibility with new versions of browsers is important. The future of the pdf.js project is somewhat questionable. Its key proponents, Andreas Gal and Chris Jones, have both left Mozilla. Mozilla Labs, which hosted the pdf.js project, was shut down. Active contributors to the GitHub project have dwindled to two or three, and it’s unclear if they are employed to work on pdf.js (and if so, if it’s full time), or if they are community volunteers. Either way, pdf.js’ champions have moved on, and it would appear so has Mozilla.

At PDFTron, PDF SDKs are our reason for being. PDF is not a side project; it is what we’ve been concentrating on for over a decade. As the software industry continues to evolve, we will advance our offerings to ensure that they are not only compatible with the latest technologies but also take full advantage of them. This commitment is why we are able to offer a multi-platform PDF solution spanning Web/Windows/Linux/Mac/Android/iOS in multiple languages. With PDFTron you have a partner who is wholly committed to PDFs, now and for years to come.

Conclusions

pdf.js is certainly the most high-profile browser-based PDF renderer. However, it is important to understand that the renderer knowingly omits parts of the PDF specification and so displays many PDFs incorrectly. Mozilla has always advocated for the open web, so it is strange that the pdf.js project would be designed to rely on non-standardized behaviour (and to implement only part of the PDF standard at that). Contentment with inconsistent and incorrect PDF rendering harms the entire ecosystem because if low-quality renderers such as pdf.js proliferate, the unreliable rendering will cause individuals to lose trust in the PDF format itself.

Beyond PDF rendering, pdf.js has additional deficiencies: the project is slow to fix bugs, offers little in the way of support and does not support many important PDF features such as form filling and annotations. As for its future, there is little indication that any of the above will change. pdf.js appears to be losing momentum as Mozilla’s priorities shift away from PDF rendering back to priorities more closely aligned with its core mission.

PDFNetJS does not suffer from any of the above problems. Rendering is absolutely reliable and is consistent across browsers. We are wholly committed to the customer and can fix any bugs in short order. We are also here for PDF for the long haul, so as with the rest of our SDKs, you can count on PDFNetJS being updated with new capabilities and to take advantage of whatever changes web browsers bring next.

Please check out PDFNetJS – we think you’ll like what you find.


Semantic Content Recognition in PDF

Semantic content recognition is the ability to identify components of a document by their “class”, that is, whether any particular content constitutes a title, subtitle, section, paragraph, word, figure, caption, table, etc. This is a problem that, despite decades of research, remains open. Available solutions are unreliable and are far, far behind the ability of a human being.

At the 2015 PDF Technical Conference, PDFTron’s CTO gave a presentation addressing the problem of semantic content recognition in PDF. The presentation gives an overview of the problem itself, why it has been such a hard problem to solve, and how the industry as a whole might organize itself to finally develop solutions that perform with the same accuracy as a person.



Getting Started with PDFNet for iOS

This Getting Started document is for users of the PDFNet dynamic framework, version 6.7 and greater. For users of the static library, or of an earlier version of PDFNet, please refer to getting started with the static framework.

Introduction

This short tutorial will guide you through creating an app that can show and annotate a PDF. A completed project can be found on our GitHub repository. You should use the latest versions of the PDFNet Framework and the tools source code, available by request on our website. The tutorial is divided into four parts:

  • Part 1: Showing a PDF.
  • Part 2: Adding support for text selection, annotation creation and editing, link following and form filling.
  • Part 3: Adding support for encrypted PDF documents.
  • Part 4: Next Steps

(Note that PDFNet for iOS includes the sample projects “PDFViewCtrl” and “Complete Reader”, which implement the features contained in this sample plus more.)

Part 1: Showing a PDF

1. Create a new app

Open Xcode 8 or greater and create a new iOS Project, choosing “Single View Application” from the list of available templates. After clicking “next”, name the project PTTest. Save the new project at the location of your choosing.

2. Prepare the project

The first thing is to change the name of AppDelegate.m to AppDelegate.mm. This changes the file to an Objective-C++ file, and will ensure that the C++ standard library that is required by PDFNet is included at link time. (The project should have at least one .mm or .cpp file; which file is not important. Alternatively, open the project settings, select Build Phases, and under “Link Binary With Libraries”, add “libc++.tbd”.)

Secondly, in the project’s Build Options, set Enable Bitcode to No. (A bitcode version of the library can be made available to licensed customers.)

Lastly, in your target’s Build Phases panel, add a new “New Run Script Phase”, and add the following script:


bash "$BUILT_PRODUCTS_DIR/$FRAMEWORKS_FOLDER_PATH/PDFNet.framework/strip-framework.sh"

This will ensure the simulator slices are stripped from the framework before the app is submitted to the App Store.

3. Add the framework and a PDF to the project

Add the framework: In your project’s settings general tab, scroll to the “Embedded Binaries” section, click the ‘+’, and select PDFNet.framework from your filesystem.

Add a PDF: For the purposes of this tutorial, we will use “mech.pdf”, which is included in the TestFiles folder. Add it by dragging it to your project’s other files in Xcode. You can use any PDF you want, just replace “mech” with the name of your PDF in the code snippets that follow.

4. Add code to show a PDF

Change ViewController.m to the following. The changes are the additional #import statement, and the code in the viewDidLoad selector.

#import "ViewController.h"
#import <PDFNet/PDFNet.h>
#import <Tools/Tools.h>

@interface ViewController ()

@end

@implementation ViewController

- (void)viewDidLoad
{
 [super viewDidLoad];
 // Do any additional setup after loading the view, typically from a nib.

 // Initialize PDFNet (in demo mode - pages will be watermarked)
 [PTPDFNet Initialize:@""];

 // Get the path to document in the app bundle.
 NSString* fullPath = [[NSBundle mainBundle] pathForResource:@"mech" ofType:@"pdf"];

 // Initialize a new PDFDoc with the path to the file
 PTPDFDoc* docToOpen = [[PTPDFDoc alloc] initWithFilepath:fullPath];

 // Create a new PDFViewCtrl that is the size of the entire screen
 PTPDFViewCtrl* pdfViewCtrl = [[PTPDFViewCtrl alloc] init];

 // Set the document to display
 [pdfViewCtrl SetDoc:docToOpen];

 // Add the PDFViewCtrl to the root view
 [self.view addSubview:pdfViewCtrl];

 // set size of PDFViewCtrl
 [pdfViewCtrl setTranslatesAutoresizingMaskIntoConstraints:NO];

 [NSLayoutConstraint activateConstraints:[NSLayoutConstraint constraintsWithVisualFormat:@"H:|[pdfViewCtrl]|" options:0 metrics:nil views:NSDictionaryOfVariableBindings(pdfViewCtrl)]];

 [NSLayoutConstraint activateConstraints:[NSLayoutConstraint constraintsWithVisualFormat:@"V:|[pdfViewCtrl]|" options:0 metrics:nil views:NSDictionaryOfVariableBindings(pdfViewCtrl)]];

}

- (void)didReceiveMemoryWarning
{
    [super didReceiveMemoryWarning];
    // Dispose of any resources that can be recreated.
}

@end

5. Run the app

You can now run the app. If you run in the simulator, you will see the following. Note that the PDF can be scrolled and zoomed.

iPad-Simulator

When zooming, you might notice that the gray area behind the pages does not match the white background of the root view. You can fix this by adding the following code at the end of the viewDidLoad selector.

	// Makes the background light gray
	[self.view setBackgroundColor:[UIColor lightGrayColor]];

	// sets the non-page content of the PDFViewCtrl to transparent (alpha of 0)
	[pdfViewCtrl SetBackgroundColor:0 g:0 b:0 a:0];

Part 2: Adding support for Annotations, Text Selection and Form Filling

PDFNet comes with built-in support for text selection, interactive annotation creation and editing, form filling and link following. These features have been implemented in an open source project using the PDFNet API, and are included as a  project that builds the dynamic library Tools.framework. Because the source is provided, implementers have complete flexibility and control to customize how users interact with the PDF so that it can fit their requirements exactly. To add support for annotations, text selection, etc:

  1. Add the framework: In your project’s settings general tab, scroll to the “Embedded Binaries” section, click the ‘+’, and select Tools.framework from your filesystem.
  2. Add #import <Tools/Tools.h> at the top of ViewController.m
  3. Add the following lines as the last lines of the viewDidLoad selector in ViewController.m
	// creates a new tool manager using the designated initializer
	ToolManager* toolManager = [[ToolManager alloc] initWithPDFViewCtrl:pdfViewCtrl];

	// registers the tool manager to receive events
	[pdfViewCtrl setToolDelegate:toolManager];

	// sets the initial tool
	[toolManager changeTool:[PanTool class]];

You are now ready to run the project again. Now, when you run the project, you can select text, follow links and create and edit annotations. To create a new annotation, long press on an area of the document to trigger a popup with annotation types to create. This example behavior is shown in the screenshot below.

annotations

Part 3: Opening encrypted documents.

PDFNet supports opening encrypted PDF documents. To open an encrypted document, all you need to do is initialize the PDFDoc’s security handler with the correct password. Add the following code snippet after creating the PDFDoc in order to display an encrypted PDF.

if( [docToOpen InitStdSecurityHandlerWithPassword:@"password-string" password_sz:0] == NO )
{
  NSLog(@"Password is incorrect");
  return;
}

Of course a “real” app would require that the password be obtained from the user, which is implemented in the sample viewer that is included with the PDFNet for iOS download.

Part 4: Next Steps

This concludes our introductory PDFNet for iOS Tutorial. The completed tutorial project can be downloaded from GitHub. For more help, please see the online documentation, sample code, and other tutorials.


Getting Started with Cross-Platform PDF Processing Using Xamarin.iOS and PDFNet SDK

This Getting Started document is for users of PDFNet version 6.7 and greater. For users of earlier versions of PDFNet, please refer to Getting Started (2014).

Introduction

This tutorial shows the minimum steps needed to add a PDF viewing and annotating component to a Xamarin.iOS app using PDFNet SDK. In this tutorial, you will create a simple PDF viewing and annotating app. You will also learn about an iOS Objective-C Bindings Library Project that allows you to customize our Tools library.

Note that the completed sample project described in Parts 1-3 is available by request from here.

The tutorial is divided into 4 parts:

  • Part 1: Showing a PDF
  • Part 2: Adding support for Annotations, Text Selection and Form Filling
  • Part 3: Create customized Tools.dll from the open source Tools library
  • Part 4: Next steps

Part 1: Showing a PDF

  • Create a new app

Open Xamarin Studio or Visual Studio and create a new Xamarin.iOS project. If this is your first Xamarin.iOS project, check out the getting started with Xamarin.iOS guide here: https://developer.xamarin.com/guides/ios/getting_started/hello,_iOS/

  • Add required dependency and a PDF file to the project

Add the PDFNetiOS.dll to the References list. Add a sample file “sample.pdf” to the Resources list. Set the Build Action of the sample file to BundleResource.

  • Add code to show a PDF

Change ViewController.cs to the following.

using System;
using System.IO;
using CoreGraphics;
using Foundation;
using UIKit;
using CoreAnimation;
using ObjCRuntime;
using System.Collections.Generic;

using pdftron;
using pdftron.PDF;
using pdftron.PDF.Tools;
using pdftron.PDF.Controls;

namespace PDFNetiOSXamarinSample
{
    public partial class PDFNetiOSXamarinSampleViewController : UIViewController
    {
        private PDFViewCtrl mPdfViewCtrl;

        public PDFNetiOSXamarinSampleViewController () : base ("PDFNetiOSXamarinSampleViewController", null)
        {
        }

        public override void ViewDidLoad ()
        {
            base.ViewDidLoad ();
            // Do any additional setup after loading the view, typically from a nib.

            try
            {
                // Initialize PDFNet (in demo mode - pages will be watermarked)
                PDFNet.Initialize();

                Console.WriteLine("version:" + PDFNet.GetVersion());
            }
            catch (pdftron.Common.PDFNetException e)
            {
                Console.WriteLine(e.GetMessage());
                return;
            }

            View.Frame = UIScreen.MainScreen.Bounds;

            // Create a new PDFViewCtrl that is the size of the entire screen
            CGRect viewRect = new CGRect(0, 0, View.Frame.Size.Width, View.Frame.Size.Height);
            mPdfViewCtrl = new PDFViewCtrl(viewRect);
            mPdfViewCtrl.PagePresentationMode = PDFViewCtrl.PagePresentationModes.e_single_continuous;
            mPdfViewCtrl.TranslatesAutoresizingMaskIntoConstraints = false;
            mPdfViewCtrl.AutoresizingMask = UIViewAutoresizing.FlexibleWidth | UIViewAutoresizing.FlexibleHeight;

            View.AddSubview(mPdfViewCtrl);

            // Path to the document, relative to the app bundle.
            string docPath = "sample.pdf";
            // Initialize a new PDFDoc with the path to the file
            PDFDoc docToOpen = new PDFDoc(docPath);

            // Set initial document from disk
            mPdfViewCtrl.Doc = docToOpen;
        }

        public override void DidReceiveMemoryWarning ()
        {
            base.DidReceiveMemoryWarning ();
            // Release any cached data, images, etc that aren't in use.

            mPdfViewCtrl.PurgeMemory();
        }
    }
}
  • Run the app

You can now run the app. If you run in the simulator, you will see the following. Note that the PDF can be scrolled and zoomed.

simulator-screen-shot-dec-2-2016-4-13-33-pm

Part 2: Adding support for Annotations, Text Selection and Form Filling

PDFNet comes with built-in support for text selection, interactive annotation creation and editing, form filling and link following. These features have been implemented in an open source project using the PDFNet API, and are included as a project that builds the Tools.dll. Because the source is provided, implementers have complete flexibility and control to customize how users interact with the PDF so that it can fit their requirements exactly. To add support for annotations, text selection, etc:

  • Add the dependency

Add the Tools.dll to the References list.

  • Add the following lines as the last lines of the ViewDidLoad method in ViewController.cs
ToolManager toolManager = new ToolManager(mPdfViewCtrl);
mPdfViewCtrl.ToolManager = toolManager;
toolManager.ChangeTool(typeof(pdftronprivate.PanTool));

You are now ready to run the project again. Now, when you run the project, you can select text, follow links and create and edit annotations. To create a new annotation, long press on an area of the document to trigger a popup with annotation types to create. This example behavior is shown in the below screenshot.

simulator-screen-shot-dec-2-2016-4-39-35-pm

Part 3: Create customized Tools.dll from the open source Tools library

In the previous section you have seen how to use PDFViewCtrl with the tools add-on. However, if you wish to customize the Tools behavior, you will need to do so using the open source Objective-C Tools project and then prepare the dynamic framework for building Tools.dll in Xamarin. This section demonstrates how to create Tools.dll from Tools.framework.

Please note that the following section requires some understanding of what a dynamic framework is and how to build one, as well as what a Xamarin binding project is and how to build one. Please refer to this article for more information.

You will need the PDFNetiOS package to proceed. It is available by request from here.

  • In the requested PDFNetiOS package, browse to /Lib/Tools/src/PDFViewCtrlTools. Open Tools.xcodeproj in Xcode and make the desired changes. Compile a fat dynamic framework Tools.framework.
  • Browse to /lib/ios/Native/Tools/binding. Open the provided binding project PDFViewCtrlTools.csproj in Xamarin Studio or Visual Studio. Remove the existing Tools reference under Native References. Click Add Native Reference and select the Tools.framework from the previous step.
  • Clean and build PDFViewCtrlTools project. A new Tools.dll will be created in /lib/ios/ folder. Use this new Tools.dll in your application. Alternatively, you can also include the PDFViewCtrlTools binding project in your application and make the binding project a project reference in your application project.

Part 4: Next steps

This concludes our introductory PDFNet for Xamarin.iOS Tutorial. The completed tutorial project is available by request from here. For more help, please see the sample code and other tutorials. You can also browse our public forum for more information about PDFNet. For details related to technical support, please refer to the PDFTron support page.


Getting Started with Cross-Platform PDF Processing Using Xamarin.Android and PDFNet SDK

This Getting Started document is for users of PDFNet version 6.7 and greater. For users of earlier versions of PDFNet, please refer to Getting Started (2014).

Introduction

This tutorial shows the minimum steps needed to add a PDF viewing and annotating component to a Xamarin.Android app using PDFNet SDK. In this tutorial, you will create a simple PDF viewing and annotating app. You will also learn about an Android Java Bindings Library Project that allows you to customize our Tools library.

Note that the completed sample project described in Parts 2-4 is available by request from here.

The tutorial is divided into 5 parts:

  • Part 1: Things you should know about the library distribution
  • Part 2: Showing a PDF
  • Part 3: Adding support for Annotations, Text Selection and Form Filling
  • Part 4: Create customized Tools.dll from the open source Tools library
  • Part 5: Next steps

Part 1: Things you should know about the library distribution

  • Full version vs. Standard version: The PDFNet for Xamarin.Android library is available in two different versions, the Standard version and the Full version. The Standard library offers the same viewing, annotation, and editing capabilities as the Full version; however, it is much smaller. The main differences are:
    • The Full version has a built-in digital signature handler, which can be used by calling PDFDoc.AddStdSignatureHandler(); the Standard version does not have a built-in signature handler. However, even with the Standard version, you will still be able to use the DigitalSignature tool because the pre-built Tools library includes Spongy Castle.
    • The Standard version does not support converting PDF pages to TIFF and PNG formats (i.e. PDFDraw will not work when using these formats.)
    • A number of features will only be available using the Full version due to their complexity. For example, universal document conversion (i.e. convert .docx, .doc, and .pptx to .pdf), reflow, and document preview cache generation. The detailed class list can be found in Release notes.
  • Best for app development vs. Best for app deployment: To provide a user-friendly framework for both development and deployment, PDFNet provides two options for using the PDFNetAndroid.dll. (To learn more about Android CPU architecture, please refer to this article.)
    • Best for app development: a single PDFNetAndroid.dll that contains arm64-v8a, armeabi-v7a, x86, and x86_64 architectures that can be used to create a single APK that works on all armeabi-v7a, arm64-v8a, x86 or x86_64 devices/emulators running Android 2.2 or greater. This .dll is located in Lib/Full and Lib/Standard.
    • Best for app deployment: five separate PDFNetAndroid.dlls, one for each architecture (arm64-v8a, armeabi, armeabi-v7a, x86, x86_64), that can be used to build five separate APKs. This allows an absolute minimum APK download size, and allows you to create APKs that work on armeabi, armeabi-v7a, arm64-v8a, x86 and x86_64 devices/emulators running Android 2.2 or greater. These .dlls are found in the /arm64-v8a, /armeabi, /armeabi-v7a, /x86 and /x86_64 folders respectively. (These five folders are themselves located in /Lib/Full and /Lib/Standard.)
  • Minimum Android API required: PDFNetAndroid.dll requires minimum API 11 (HONEYCOMB). Tools.dll requires minimum API 16 (JELLY_BEAN).
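
As a rough illustration of how the per-architecture layout maps onto a device, the sketch below picks the folder whose name matches the first ABI a device reports. It is plain Java with no Android or PDFNet dependencies. The class and method names (AbiFolderPicker, pickFolder) are hypothetical; only the folder names come from the list above. On newer Android versions the ordered ABI list would typically come from android.os.Build.SUPPORTED_ABIS, which is an assumption about how you would wire this up rather than something PDFNet requires.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

class AbiFolderPicker {
    // Per-architecture folder names as listed in the distribution layout.
    static final Set<String> PACKAGED_ABIS = new LinkedHashSet<>(
            Arrays.asList("arm64-v8a", "armeabi", "armeabi-v7a", "x86", "x86_64"));

    // deviceAbis: the ABIs the device supports, most preferred first.
    // Returns the first one that has a matching per-architecture folder.
    public static Optional<String> pickFolder(List<String> deviceAbis) {
        for (String abi : deviceAbis) {
            if (PACKAGED_ABIS.contains(abi)) {
                return Optional.of(abi);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // A typical 64-bit ARM device also lists the 32-bit ARM ABIs as fallbacks.
        List<String> device = Arrays.asList("arm64-v8a", "armeabi-v7a", "armeabi");
        System.out.println(pickFolder(device).orElse("no matching folder"));
    }
}
```

The single multi-architecture .dll avoids this choice entirely at the cost of a larger APK, which is why it is suggested for development rather than deployment.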

Part 2: Showing a PDF

  • Create a new app

Open Xamarin Studio or Visual Studio and create a new Xamarin.Android project. If this is your first Xamarin.Android project, check out the getting started with Xamarin.Android guide here: https://developer.xamarin.com/guides/android/getting_started/hello,android/

  • Add required dependency, resource and a PDF file to the project

Add the PDFNetAndroid.dll to the References list. Add a sample file “sample.pdf” to the Resources/raw folder of your application. Set the Build Action of the sample file to AndroidResource. Find the pdfnet.res file in the resource/android folder and add it to the Resources/raw folder of your application. Set the Build Action of the res file to AndroidResource.

  • Add code to show a PDF

Change MainActivity.cs to the following.

using System;
using System.IO;
using System.Collections;
using Android.App;
using Android.Content;
using Android.Runtime;
using Android.Views;
using Android.Widget;
using Android.OS;

using pdftron;
using pdftron.PDF;
using pdftron.PDF.Tools;
using pdftron.PDF.Controls;

using Android.Support.V4.App;

using com.xamarin.recipes.filepicker;
using pdftron.Common;

namespace PDFNetAndroidXamarinSample
{
    [Activity(Label = "@string/app_name", MainLauncher = true, Icon = "@drawable/pdf_icon", HardwareAccelerated = true,
        ConfigurationChanges = Android.Content.PM.ConfigChanges.ScreenSize | Android.Content.PM.ConfigChanges.Orientation | Android.Content.PM.ConfigChanges.KeyboardHidden,
        WindowSoftInputMode = SoftInput.AdjustPan,
        Theme = "@style/AppTheme")]
    public class MainActivity : FragmentActivity
    {
        private pdftron.PDF.PDFViewCtrl mPdfViewCtrl;

        protected override void OnCreate(Bundle bundle)
        {
            base.OnCreate(bundle);

            try
            {
                PDFNet.Initialize(this, Resource.Raw.pdfnet); // No license key, will produce water-marks
                //PDFNet.Initialize(this, Resource.Raw.pdfnet, "your license key"); // Full version mode
                // Disk caching should be disabled if write-external-storage is not permitted.
                // To add the permission, add
                // <uses-permission android:name="android.permission.WRITE_EXTERNAL_STORAGE" />
                // to the manifest file.
                Console.WriteLine(PDFNet.GetVersion());
            }
            catch (PDFNetException e)
            {
                Console.WriteLine(e.GetMessage());
                return;
            }

            // Set our view from the "main" layout resource
            SetContentView(Resource.Layout.Main);
            mPdfViewCtrl = FindViewById<pdftron.PDF.PDFViewCtrl>(Resource.Id.pdfviewctrl);
            mPdfViewCtrl.PagePresentationMode = PDFViewCtrl.PagePresentationModes.e_single_continuous;

            // Load file from resource
            Stream fis = this.Resources.OpenRawResource(Resource.Raw.sample);
            PDFDoc docToOpen = new PDFDoc(fis);
            mPdfViewCtrl.Doc = docToOpen;
        }

        protected override void OnPause()
        {
            base.OnPause();
            if (mPdfViewCtrl != null)
            {
                mPdfViewCtrl.Pause();
            }
        }

        protected override void OnResume()
        {
            base.OnResume();
            if (mPdfViewCtrl != null)
            {
                mPdfViewCtrl.Resume();
            }
        }

        protected override void OnDestroy()
        {
            base.OnDestroy();

            if (mPdfViewCtrl != null)
            {
                mPdfViewCtrl.Destroy();
                mPdfViewCtrl = null;
            }
        }

        public override void OnLowMemory()
        {
            base.OnLowMemory();
            if (mPdfViewCtrl != null)
            {
                mPdfViewCtrl.PurgeMemory();
            }
        }
    }
}
  • Run the app

You can now run the app. If you run in the emulator, you will see the following. Note that the PDF can be scrolled and zoomed.

screenshot_1480980383

Part 3: Adding support for Annotations, Text Selection and Form Filling

PDFNet comes with built-in support for text selection, interactive annotation creation and editing, form filling and link following. These features have been implemented in an open source project using the PDFNet API, and are included as a project that builds the Tools.dll. Because the source is provided, implementers have complete flexibility and control to customize how users interact with the PDF so that it can fit their requirements exactly. To add support for annotations, text selection, etc:

  • Add the dependency

Add the Tools.dll to the References list.

  • Add the following lines as the last lines of the OnCreate method in MainActivity.cs (mToolManager is assumed here to be a field of type ToolManager declared on MainActivity)
mToolManager = new ToolManager(mPdfViewCtrl);
mPdfViewCtrl.ToolManager = mToolManager;

You are now ready to run the project again. Now, when you run the project, you can select text, follow links and create and edit annotations. To create a new annotation, long press on an area of the document to trigger a popup with annotation types to create. This example behavior is shown in the below screenshot.

screenshot_1480980686

Part 4: Create customized Tools.dll from the open source Tools library

In the previous section you have seen how to use PDFViewCtrl with the tools add-on. However, if you wish to customize the Tools behavior, you will need to do so using the open source Java Tools project and then prepare the Android Archive file for building Tools.dll in Xamarin. This section demonstrates how to create Tools.dll from PDFViewCtrlTools.aar.

Please note that the following section requires some understanding of what an Android Archive is and how to build one, as well as what a Xamarin binding project is and how to build one. Please refer to this article for more information on the Android Archive file, and to this article for more information on Xamarin binding projects.

You will need the PDFNet Android SDK package and the PDFNet Android native binaries built for Xamarin.Android to proceed. They are available by request from here.

In addition, because Xamarin.Android is becoming more and more strict about binding rules, users are no longer able to create a customized Tools.dll with only the Tools project; the PDFNetAndroid binding project is also required. This is because the PDFNetAndroid binding project uses PDFNet.jar for internal binding, and this PDFNet.jar needs to match the PDFNet.jar inside PDFViewCtrlTools.aar. The PDFNetAndroid binding project is available by request from here.

  • In the requested PDFNet Android SDK package, browse to /lib folder. Copy PDFNet.jar file to /lib/src/PDFViewCtrlTools/libs folder.
  • Import the PDFViewCtrlTools project to Android Studio and make the desired changes. Compile an Android Archive file PDFViewCtrlTools.aar.
  • Unzip PDFViewCtrlTools.aar and find PDFNet.jar file located in /libs folder of the unzipped PDFViewCtrlTools folder. This PDFNet.jar will be the EmbeddedJar used for PDFNetAndroid binding project.
  • In the /lib/android/Native folder, add the PDFNet.jar file obtained in the previous step. In the same directory, replace the existing PDFViewCtrlTools.aar file with the PDFViewCtrlTools.aar file you compiled in Android Studio.
  • In the requested PDFNet Android native binaries package, copy either full version or standard version of the native libraries to /lib/android/Native/nativeLib.
  • Browse to /projectSrc/PDFNetAndroidXamarin. Open the PDFNetAndroidXamarin.sln provided in Xamarin Studio or Visual Studio. Check the Build Action for PDFNet.jar file, and make sure it is set to EmbeddedJar. Check the Build Action for PDFViewCtrlTools.aar file, and make sure it is set to LibraryProjectZip. Set Build Action for all native binaries to EmbeddedNativeLibrary.
  • Clean and build PDFNetAndroid binding solution. A new PDFNetAndroid.dll and Tools.dll will be created in /lib/android/ folder. Use this new PDFNetAndroid.dll and Tools.dll in your application. Alternatively, you can also include the PDFNetAndroid project, FloatingActionButton project and PDFViewCtrlTools project in your application and make the binding project a project reference in your application project.

Part 5: Next steps

This concludes our introductory PDFNet for Xamarin.Android Tutorial. The completed tutorial project is available by request from here. For more help, please see the sample code, and other tutorials. You can also browse our public forum for more information about PDFNet. For details related to technical support, please refer to PDFTron support page.


Creating a Realtime PDF Annotation and Commenting System

Viewing a PDF directly in a web app is steadily becoming mainstream. PDFTron’s WebViewer, the leading, most reliable JavaScript PDF viewer, now powers hundreds of apps around the web.

After viewing, the next step for many apps is enabling users to annotate PDFs and communicate about them in real time, directly in the browser. WebViewer has always supported this: by importing and exporting annotations and comments using the standard PDF XML comment format, annotations can be synchronized between users almost instantly.

In this tutorial, we will step you through this process by

  • Setting up a new WebViewer instance
  • Synchronizing annotations between the clients and a server
  • Authenticating users, and enforcing permissions

For this tutorial, we will use Google Firebase as the server. It is quick to get started with and free to trial for an unlimited time. (Of course any server that can store data and trigger WebSocket events could be used in its place.)

Initial setup – HTML

  1. Download WebViewer SDK and unzip the package.
  2. Copy lib/ folder to a location on your web server.
  3. Create an HTML page.
  4. Add the following scripts to the head of the HTML page. WebViewer.js depends on jQuery, so jQuery must be included first. Instead of WebViewer.js you could include WebViewer.min.js, which is a minified version of the same file.
    <script src="jquery-1.7.2.min.js"></script>
    <script src="lib/WebViewer.js"></script>
    
  5. Add the scripts needed for the server methods. In this tutorial, we are going to include the Firebase library and a separate file named server.js.
    <script src="https://www.gstatic.com/firebasejs/3.5.3/firebase.js"></script>
    <script src="server.js"></script>
    
  6. Add a script to initiate and use WebViewer.
    <script src="main.js"></script>
    
  7. Add a stylesheet to style the WebViewer element and some other user feedback elements.
    <link rel="stylesheet" href="index.css">
    
  8. Create a div tag in the HTML body and give it an id. This will be the container for the WebViewer.
    <div id="viewer"></div>
    
  9. Create a few more div tags in the HTML body as follows. They let a new user set a name, or show a returning user’s name. The class names must match the selectors used in main.js and index.css.
    <div class="popup-container">
      <div class="popup returning-author">
        <div class="greeting">Welcome back</div>
        <div class="name"></div>
      </div>
      <div class="popup new-author">
        <div class="greeting">Welcome! Tell us your name :)</div>
        <input class="name" autofocus />
        <div class="button">Start</div>
      </div>
    </div>
    

Server – JavaScript

In realtime collaboration, the server merely acts as an online database that triggers events when data is created, modified or deleted. As long as that requirement is met, your server can be built in any language and stack of your choice. For simplicity, this tutorial uses Firebase.
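
To make that requirement concrete, here is a minimal sketch of a stand-in that satisfies it without any backend. InMemoryServer and its emit helper are hypothetical names, not part of WebViewer or Firebase; the event and method names simply mirror the ones this tutorial defines in server.js, and the object passed to callbacks only imitates the key and val() members that the tutorial's callbacks read.

```javascript
// Hypothetical in-memory stand-in for the tutorial's Server object.
// It stores annotations in a plain object and notifies listeners on change.
function InMemoryServer() {
  this.annotations = {};
  this.listeners = {
    onAnnotationCreated: [],
    onAnnotationUpdated: [],
    onAnnotationDeleted: []
  };
}

// Register a callback for one of the three annotation events.
InMemoryServer.prototype.bind = function(action, callbackFunction) {
  if (!Object.prototype.hasOwnProperty.call(this.listeners, action)) {
    throw new Error('The action is not defined: ' + action);
  }
  this.listeners[action].push(callbackFunction);
};

// Call every listener of an event with a snapshot-like object
// that exposes key and val().
InMemoryServer.prototype.emit = function(action, key, value) {
  this.listeners[action].forEach(function(callbackFunction) {
    callbackFunction({ key: key, val: function() { return value; } });
  });
};

InMemoryServer.prototype.createAnnotation = function(annotationId, annotationData) {
  this.annotations[annotationId] = annotationData;
  this.emit('onAnnotationCreated', annotationId, annotationData);
};

InMemoryServer.prototype.updateAnnotation = function(annotationId, annotationData) {
  this.annotations[annotationId] = annotationData;
  this.emit('onAnnotationUpdated', annotationId, annotationData);
};

InMemoryServer.prototype.deleteAnnotation = function(annotationId) {
  var removed = this.annotations[annotationId];
  delete this.annotations[annotationId];
  this.emit('onAnnotationDeleted', annotationId, removed);
};
```

A stand-in like this only reaches listeners in the same page, so it is useful for local experiments rather than real collaboration; a real server has to deliver the same three events to every connected client.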

  1. Go to the Firebase Console, login and create a project.
  2. Click “Add Firebase to your Web App” and copy the whole code for “Initializing Firebase”. If storageBucket is empty, close the popup and try again (that’s a known bug from Firebase).
  3. Create a JavaScript file and name it server.js.
  4. Paste the code that you have copied from Firebase. (Note that you should remove the script tags)
  5. Store the firebase.database.References for annotations and users. We will use these to create/update/delete data, and listen to data change events as well.
    window.Server = function() {
      var config = {
        apiKey: "YOUR_API_KEY",
        authDomain: "PROJECT_ID.firebaseapp.com",
        databaseURL: "https://PROJECT_ID.firebaseio.com",
        storageBucket: "PROJECT_ID.appspot.com",
        messagingSenderId: "YOUR_SENDER_ID"
      };
      firebase.initializeApp(config);
    
      this.annotationsRef = firebase.database().ref().child('annotations');
      this.authorsRef = firebase.database().ref().child('authors');
    };
    
  6. Create a custom bind function for authorization and data using firebase.auth.Auth#onAuthStateChanged and firebase.database.Reference#on.
    Server.prototype.bind = function(action, callbackFunction) {
      switch(action) {
        case 'onAuthStateChanged':
          firebase.auth().onAuthStateChanged(callbackFunction);
          break;
        case 'onAnnotationCreated':
          this.annotationsRef.on('child_added', callbackFunction);
          break;
        case 'onAnnotationUpdated':
          this.annotationsRef.on('child_changed', callbackFunction);
          break;
        case 'onAnnotationDeleted':
          this.annotationsRef.on('child_removed', callbackFunction);
          break;
        default:
          console.error('The action is not defined.');
          break;
      }
    };
    
  7. Define a method to check if author exists in the database. We will use firebase.database.Reference#once and firebase.database.DataSnapshot#hasChild to do so.
    Server.prototype.checkAuthor = function(authorId, openReturningAuthorPopup, openNewAuthorPopup) {
      this.authorsRef.once('value', function(authors) {
        if (authors.hasChild(authorId)) {
          this.authorsRef.child(authorId).once('value', function(author) {
            openReturningAuthorPopup(author.val().authorName);
          });
        } else {
          openNewAuthorPopup();
        }
      }.bind(this));
    };
    
  8. Define a sign-in method. In this tutorial, we will use firebase.auth.Auth#signInAnonymously.
    Server.prototype.signInAnonymously = function() {
      firebase.auth().signInAnonymously().catch(function(error) {
        if (error.code === 'auth/operation-not-allowed') {
          alert('You must enable Anonymous auth in the Firebase Console.');
        } else {
          console.error(error);
        }
      });
    };
    
  9. From the Firebase console click the “Authentication” button on the left panel and then click the “Sign-in Method” tab, just to the right of “Users”. From this page click the “Anonymous” button and choose to enable Anonymous login.
  10. Define data-write methods using firebase.database.Reference#set and firebase.database.Reference#remove.
    Server.prototype.createAnnotation = function(annotationId, annotationData) {
      this.annotationsRef.child(annotationId).set(annotationData);
    };
    
    Server.prototype.updateAnnotation = function(annotationId, annotationData) {
      this.annotationsRef.child(annotationId).set(annotationData);
    };
    
    Server.prototype.deleteAnnotation = function(annotationId) {
      this.annotationsRef.child(annotationId).remove();
    };
    
    Server.prototype.updateAuthor = function(authorId, authorData) {
      this.authorsRef.child(authorId).set(authorData);
    };
    
  11. Last but not least, you should add server-side permission rules for writing data. Although client-side permission checking is supported in WebViewer, every user has read access to each annotation’s information (including authorId and authorName). Thus, write permission should be enforced on the server as well. In this tutorial, we use Firebase’s Database Rules. From the console, click the “Database” button on the left panel and then click the “Rules” tab, just to the right of “Data”. Copy the JSON below and paste it there. This ensures that attempts to modify someone else’s annotation are rejected.
    {
      "rules": {
        ".read": "auth != null",
    
        "annotations": {
          "$annotationId": {
            ".write": "auth.uid === newData.child('authorId').val() || auth.uid === data.child('authorId').val() || auth.uid === newData.child('parentAuthorId').val() || auth.uid === data.child('parentAuthorId').val()"
          }
        },
    
        "authors": {
          "$authorId": {
            ".write": "auth.uid === $authorId"
          }
        }
      }
    }
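
To see what the annotation ".write" expression accepts, it can help to restate it in plain JavaScript. The sketch below is only a reading aid: canWriteAnnotation is a hypothetical function, not something Firebase provides, and it assumes that a missing record or field reads as null and that a signed-out user is always rejected.

```javascript
// Plain-JavaScript restatement of the ".write" rule for annotations.
// uid: the signed-in user's id, or null when signed out.
// existing: the record currently stored (null if there is none).
// incoming: the record being written (null for a delete).
function canWriteAnnotation(uid, existing, incoming) {
  // Assumption: a signed-out user can never write
  if (uid === null || uid === undefined) {
    return false;
  }
  function field(record, name) {
    // Missing records or fields are treated as null
    return record && record[name] !== undefined ? record[name] : null;
  }
  return uid === field(incoming, 'authorId') ||
         uid === field(existing, 'authorId') ||
         uid === field(incoming, 'parentAuthorId') ||
         uid === field(existing, 'parentAuthorId');
}

// The author creates an annotation
console.log(canWriteAnnotation('u1', null, { authorId: 'u1' }));
// Another user tries to delete it
console.log(canWriteAnnotation('u2', { authorId: 'u1', parentAuthorId: null }, null));
// The parent annotation's author deletes a reply
console.log(canWriteAnnotation('u2', { authorId: 'u1', parentAuthorId: 'u2' }, null));
```

Because the four conditions are joined with ||, a write is accepted as soon as any one of them matches, including the case where only the incoming record names the writer as author. Stricter rules are possible if your app needs them.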
    

Client – JavaScript

  1. Create a JavaScript file and name it main.js.
  2. Instantiate WebViewer on a DOM element, making sure to wrap this code and any further code inside $(document).ready(). Initial document can be any PDF or XOD file.
    $(document).ready(function() {
      var viewerElement = document.getElementById('viewer');
      var myWebViewer = new PDFTron.WebViewer({
        type: "html5",
        path: "lib",
        initialDoc: "MY_INITIAL_DOC.pdf",
        documentId: "unique-id-for-this-document",
        enableAnnotations: true,
      }, viewerElement);
    
    });
    
  3. Create the server.
    var server = new Server();
    
  4. Bind a callback function to DocumentViewer.documentLoaded event. You will then be able to get annotationManager and access its methods.
    $(viewerElement).on('documentLoaded', function() {
      // show the notes panel by default
      myWebViewer.getInstance().showNotesPanel(true);
    
      var annotationManager = myWebViewer.getInstance().docViewer.getAnnotationManager();
      // Code in later steps will come here...
    });
    
  5. Inside the documentLoaded callback, bind another callback function to server’s onAuthStateChanged event that is defined in server.js. A firebase.User object will be passed as a parameter.
    1. If the user is not logged in we’ll call the sign-in method that we defined in server.js.
    2. If the user is logged in, we’ll store their uid in the authorId variable, which will be used for client-side annotation permission checks.
    3. We call server.checkAuthor with three parameters: authorId, the openReturningAuthorPopup function and the openNewAuthorPopup function. These functions will be discussed in the next steps.
    4. Then, we will send author information to the server and bind callback functions to annotation events. Details of the callback functions will be discussed in next steps.
    var authorId = null;
    
    server.bind('onAuthStateChanged', function(user) {
      // User is logged in
      if (user) {
        // Using uid property from Firebase Database as an author id
        // It is also used as a reference for server-side permission
        authorId = user.uid;
        // Check if user exists, and call appropriate callback functions
        server.checkAuthor(authorId, openReturningAuthorPopup, openNewAuthorPopup);
        // Bind server-side data events to callback functions
        // When loaded for the first time, onAnnotationCreated event will be triggered for all database entries
        server.bind('onAnnotationCreated', onAnnotationCreated);
        server.bind('onAnnotationUpdated', onAnnotationUpdated);
        server.bind('onAnnotationDeleted', onAnnotationDeleted);
      }
      // User is not logged in
      else {
        // Login
        server.signInAnonymously();
      }
    });
    
  6. Define the popup callback functions that were passed to server.checkAuthor, along with a helper that saves the author name.
    1. openReturningAuthorPopup is a callback function triggered when author data is found in the database. It will receive authorName as a parameter, and open a popup with the authorName as a visual feedback.
    2. openNewAuthorPopup is a callback function triggered when author data is not found. Then we will open a popup for a new author to setup an author name.
    3. updateAuthor is a function which will set author name in both client and server using annotationManager.setCurrentUser and server.updateAuthor, respectively.
    function openReturningAuthorPopup(authorName) {
      annotationManager.setCurrentUser(authorName);
      $('.returning-author .name').html(authorName);
      $('.returning-author').css('display', 'block').click(function(e) {
        e.stopPropagation();
      });
      $('.popup-container').click(function() {
        $('.popup-container').css('display', 'none');
      });
      $('.popup-container').keypress(function(e) {
        if (e.which === 13) {
          $('.popup-container').css('display', 'none');
        }
      });
    }
    
    function openNewAuthorPopup() {
      // Open popup for a new author
      $('.new-author').css('display', 'block');
      $('.new-author .button').click(function() {
        var authorName = $('.new-author .name').get(0).value.trim();
        if (authorName) {
          updateAuthor(authorName);
        }
      });
      $('.popup-container').keypress(function(e) {
        var authorName = $('.new-author .name').get(0).value.trim();
        if (e.which === 13 && authorName) {
          updateAuthor(authorName);
        }
      });
    }
    
    function updateAuthor(authorName) {
      // The author name will be used for both WebViewer and annotations in PDF
      annotationManager.setCurrentUser(authorName);
      // Create/update author information in the server
      server.updateAuthor(authorId, { authorName });
      $('.popup-container').css('display', 'none');
    }
    
  7. Define callback functions for the server’s onAnnotationCreated, onAnnotationUpdated and onAnnotationDeleted events. A data object will be passed as a parameter. For more information, refer to firebase.database.DataSnapshot.
    1. onAnnotationCreated and onAnnotationUpdated behave almost identically in this tutorial. Both use annotationManager.importAnnotCommand to update the viewer with the XFDF change; onAnnotationCreated additionally fires updateAnnotationPermission for the new annotation.
    2. We also set a custom field authorId on the imported annotation to control client-side permission for the created/updated annotation.
    3. onAnnotationDeleted creates a delete command string from the annotation’s id and simply calls importAnnotCommand on it.
    function onAnnotationCreated(data) {
      // data.val() returns the value of server data in any type. In this case, it
      // would be an object with properties authorId and xfdf.
      var annotation = annotationManager.importAnnotCommand(data.val().xfdf)[0];
      annotation.authorId = data.val().authorId;
      annotationManager.redrawAnnotation(annotation);
      myWebViewer.getInstance().fireEvent('updateAnnotationPermission', [annotation]);
    }
    
    function onAnnotationUpdated(data) {
      var annotation = annotationManager.importAnnotCommand(data.val().xfdf)[0];
      annotation.authorId = data.val().authorId;
      annotationManager.redrawAnnotation(annotation);
    }
    
    function onAnnotationDeleted(data) {
      // data.key would return annotationId since our server method is designed as
      // annotationsRef.child(annotationId).set(annotationData)
      var command = '<delete><id>' + data.key + '</id></delete>';
      annotationManager.importAnnotCommand(command);
    }
    
  8. After server callback functions are bound, we’ll also bind a function to annotationManager.annotationChanged event.
    1. First parameter, e, has a property imported that is set to true by default for annotations internal to the document and annotations added by importAnnotCommand.
    2. Then we iterate through the annotations that are changed, which is passed as the second parameter.
    3. Third parameter, type, defines which action it was. In this tutorial, we’ll have the same behavior for both add and modify action types.
    4. When annotations are added and modified, we will call server.createAnnotation or server.updateAnnotation which needs four variables: annotationId, authorId, parentAuthorId and xfdf.
    5. annotationId can be retrieved from annotation.Id.
    6. authorId was saved as a reference when user logged in.
    7. parentAuthorId refers to the parent annotation’s author id, if any. This will be used to distinguish replies, and will be referenced in server-side permission. Thus, we retrieve authorId of the parent annotation by using annotation.InReplyTo, which returns the annotation id of the parent annotation.
    8. xfdf can be retrieved using annotationManager.getAnnotCommand. It will get an XML string specifying the added, modified and deleted annotations, which can be used to import the annotation using annotationManager.importAnnotCommand in server data callback functions.
    annotationManager.on('annotationChanged', function(e, annotations, type) {
      if (e.imported) {
        return;
      }
      annotations.forEach(function(annotation) {
        if (type === 'delete') {
          server.deleteAnnotation(annotation.Id);
        } else if (type === 'add' || type === 'modify') {
          // Both actions send the same record; only the server method differs
          var xfdf = annotationManager.getAnnotCommand();
          var parentAuthorId = null;
          if (annotation.InReplyTo) {
            parentAuthorId = annotationManager.getAnnotationById(annotation.InReplyTo).authorId;
          }
          var record = { authorId: authorId, parentAuthorId: parentAuthorId, xfdf: xfdf };
          if (type === 'add') {
            server.createAnnotation(annotation.Id, record);
          } else {
            server.updateAnnotation(annotation.Id, record);
          }
        }
      });
    });
    
  9. Lastly, we will override the client-side permission checking function using annotationManager.setPermissionCheckCallback. By default it compares the authorName. Instead, we will compare the authorId assigned by the server.
    annotationManager.setPermissionCheckCallback(function(author, annotation) {
      return annotation.authorId === authorId;
    });
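
The record-building and ownership logic from the last two steps can also be written as small standalone functions, which makes it possible to try them without a running viewer. This is only a sketch: buildAnnotationRecord, isOwnAnnotation and the findAnnotationById parameter are hypothetical names introduced here, with findAnnotationById standing in for annotationManager.getAnnotationById. Falling back to null when the parent annotation is missing is an assumption of this sketch, not behavior taken from the tutorial code.

```javascript
// Hypothetical helper: builds the record sent to the server for one annotation.
// findAnnotationById stands in for annotationManager.getAnnotationById.
function buildAnnotationRecord(annotation, authorId, xfdf, findAnnotationById) {
  var parentAuthorId = null;
  if (annotation.InReplyTo) {
    var parent = findAnnotationById(annotation.InReplyTo);
    // Assumption: use null when the parent is unknown or has no authorId yet
    if (parent && parent.authorId) {
      parentAuthorId = parent.authorId;
    }
  }
  return { authorId: authorId, parentAuthorId: parentAuthorId, xfdf: xfdf };
}

// Hypothetical helper: the same comparison as the permission check callback,
// written as a plain function.
function isOwnAnnotation(annotation, currentAuthorId) {
  return annotation.authorId === currentAuthorId;
}

// Usage with stub annotations in place of real WebViewer objects
var stubAnnotations = {
  note1: { Id: 'note1', authorId: 'alice' },
  reply1: { Id: 'reply1', InReplyTo: 'note1' }
};
function findStub(id) {
  return stubAnnotations[id] || null;
}

var record = buildAnnotationRecord(stubAnnotations.reply1, 'bob', '<add/>', findStub);
console.log(record.parentAuthorId);
console.log(isOwnAnnotation(stubAnnotations.note1, 'bob'));
```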
    

Styling – CSS

The width and height of the WebViewer element must be specified with CSS. Styling for the author name popup is also added in this tutorial. Create a file named index.css with the following content.

html {
  width: 100%;
  height: 100%;
}

body {
  width: 100%;
  height: 100%;
  padding: 0;
  margin: 0;
  overflow: hidden;
}

#viewer {
  width: 100%;
  height: 100%;
  overflow: hidden;
}

.popup-container {
  width: 100%;
  height: 100%;

  position: fixed;
  left: 0;
  top: 0;

  background: rgba(0, 0, 0, 0.5);
}

.popup {
  display: none;

  padding: 30px;
  border-radius: 10px;

  position: absolute;
  left: 50%;
  top: 50%;
  transform: translate(-50%, -50%);

  background: rgba(255, 255, 255, 1);
  box-shadow: 0 1px 10px rgba(0, 0, 0, 1);
  font-family: Verdana;
  text-align: center;
  line-height: 2em;
}

.greeting {
  margin-bottom: 10px;
}

.name {
  font-size: 25px;
  font-weight: bold;
}

.new-author .name {
  float: left;

  width: 170px;
  height: 50px;
  padding: 0 20px;
  border: 1px black;
  border-style: solid none solid solid;
  border-radius: 5px 0 0 5px;

  outline: none;
}

.new-author .button {
  float: right;

  height: 50px;
  padding: 0 20px;
  border: 1px solid black;
  border-radius: 0 5px 5px 0;

  background: white;
  cursor: pointer;
  line-height: 50px;
}

Conclusion

At this point you should be able to have multiple people access the HTML page from your server and add/modify/delete annotations in real time. To test it out yourself you could try opening it in multiple browsers or in an incognito window to simulate multiple users.


Getting Started with Android

Introduction

PDFNet Mobile SDK for Android brings the full power of the PDFNet library to Android devices. The SDK ships with simple to use Java APIs that allow developers to seamlessly integrate PDF viewing, creation, searching, annotation, and editing capabilities with their Android apps. This document will explain how to download the SDK, describe its basic components, step you through creating a simple PDF viewing app, and point you towards helpful resources. It is organized into the following sections:

First Steps
Creating a basic PDF viewer
Adding support for Annotations, Text Selection and Form Filling
Opening encrypted documents
FAQ
Additional Resources 

First Steps

If you have not done so already, download the SDK for Android, available by request on the PDFTron website. The SDK contains three sample packages (please follow the instructions in the readme document included in each package to run the applications):

  • PDFDrawDemo
  • MiscellaneousSamples
  • PDFViewCtrlProjects

PDFDrawDemo is a simple example that shows how PDFDraw can be used to make a simple PDF viewer. Tap the top half of the viewer to turn pages backward, or the bottom half to turn pages forward. To see how to use all PDFNet APIs, check out the MiscellaneousSamples application.

The PDFViewCtrlProjects package itself consists of two applications:

  • PDFViewCtrlDemo
  • CompleteReader

The PDFViewCtrlDemo sample application showcases a basic PDF viewer and editor, where it is possible to perform annotations (lines, arrows, free text, sticky notes, etc.), fill forms, zoom in/out, browse bookmarks, highlight/underline text, navigate the document with different presentation modes and much more. Feel free to browse the project and see how the control is used in the layout xml file, or how you can initialize it programmatically.

pdfviewctrldemo

PDFViewCtrlDemo in action

CompleteReader is an advanced version of PDFViewCtrlDemo. It includes a file browser with five different file views (by folder, all files, SD card (Lollipop+), favorite documents, and recently accessed documents with thumbnails) and a viewer activity that uses the controls available in the Tools library, such as the bookmark list, annotation list, annotation toolbar, thumbnail slider, and thumbnails viewer. It also includes reflow, text search, sharing, annotations, page editing, page cropping, custom color mode, support for right-to-left languages, undo/redo and much more.

completereader

CompleteReader in action

OK, the sample apps are fantastic! But how do I use the SDK with my app?

Before proceeding, it’s worth knowing what makes up the core components of PDFNet for Android. Typically you only need to include 4 items in your project (we highly recommend not changing their names):

Capture2

PDFNet.jar

This jar file contains all the Java APIs available to your application. You can find the documentation online or in the “docs” folder of the SDK package.

libPDFNetC-v7a.so

This file contains the core functionality of PDFNet and it is the heart of the SDK. The file is built separately for each architecture, and currently it is available for ARM, ARM-v7, ARM64-v8a, x86 and x86-64. (The example above uses only the ARM-v7 version of the library to keep the final APK size to a minimum.)

pdfnet.res

This file contains standard PDF resources that must be included to ensure all PDF documents will display correctly. Please check “What is the pdfnet.res file?” for more information on the resource file.

Tools Library

The preceding files are sufficient for viewing PDF documents. The Tools library adds support for text selection, interactive annotation creation and editing, form filling, link following, and so on. Please check “What is the Tools library?” for more information.

Creating a basic PDF viewer

Now that you understand the components of the SDK, let’s build a bare bones PDF viewing app. As secondary steps we will also add support for annotation creation and editing, and opening encrypted documents.

In this example we will use Android Studio 2.3, however if you are already comfortable with another IDE then feel free to use it as well. A completed version of the project can be found on our GitHub repository.

Displaying a PDF

1. Create a new Android project and add the required PDFNet files

Start a new project in Android Studio, set the Application name to pttest and the Company domain to tutorial.android.pdftron.com (so the package name will be com.pdftron.android.tutorial.pttest). In the next dialog, set the Minimum SDK to API 16, then select Add No Activity and press the Finish button.

Next we need to add the required PDFNet files to the project (it is assumed that you already downloaded the SDK package, which contains the required files. If not, please check the First Steps section at the beginning of this document):

  • Copy PDFNet.jar from the downloaded package’s “lib” folder to the “libs” folder of your project;
  • Copy libPDFNetC-v7a.so from the downloaded package’s “lib\standard\armeabi-v7a” folder to the project’s “jniLibs\armeabi-v7a” folder (note that you might need to create this folder in your project);
  • Copy pdfnet.res from the package’s “resource” folder to the project’s “res\raw” folder;
  • Add a PDF document named “sample.pdf” to the “res\raw” folder.

Your project should look like the screenshot below:

Capture

Adding PDFNet files to the project

For this project we will add only the ARM-v7 library (to keep the final APK file to a minimum size), but you could also add the ARM and x86 variants to their respective folders.

2. Update build.gradle

For this project we only need to add ARM-v7 architecture to build.gradle. Make sure that you add this to the module gradle (the build.gradle file in the “pttest\app\” folder):

    productFlavors {
        armv7a {
            ndk {
                abiFilters "armeabi-v7a"
            }
        }
    }

So the entire build.gradle file in “pttest\app” folder will be like this:

apply plugin: 'com.android.application'

android {
    compileSdkVersion 25
    buildToolsVersion "25.0.2"
    defaultConfig {
        applicationId "com.pdftron.android.tutorial.pttest"
        minSdkVersion 16
        targetSdkVersion 25
        versionCode 1
        versionName "1.0"
        testInstrumentationRunner "android.support.test.runner.AndroidJUnitRunner"
    }
    buildTypes {
        release {
            minifyEnabled false
            proguardFiles getDefaultProguardFile('proguard-android.txt'), 'proguard-rules.pro'
        }
    }
    productFlavors {
        armv7a {
            ndk {
                abiFilters "armeabi-v7a"
            }
        }
    }
}

dependencies {
    compile fileTree(dir: 'libs', include: ['*.jar'])
    androidTestCompile('com.android.support.test.espresso:espresso-core:2.2.2', {
        exclude group: 'com.android.support', module: 'support-annotations'
    })
    compile 'com.android.support:appcompat-v7:25.2.0'
    testCompile 'junit:junit:4.12'
}

3. Add a layout for our activity

Now let’s add a layout called main.xml into the “res\layout” folder which will contain the PDFViewCtrl element:

<LinearLayout xmlns:android="http://schemas.android.com/apk/res/android"
              android:orientation="vertical"
              android:layout_width="match_parent"
              android:layout_height="match_parent">
    <com.pdftron.pdf.PDFViewCtrl
        android:id="@+id/pdfviewctrl"
        android:layout_width="match_parent"
        android:layout_height="match_parent"
        android:scrollbars = "vertical|horizontal" />
</LinearLayout>

4. Add code to show PDF

Now create a new class that extends android.app.Activity called PTTestActivity, using the package com.pdftron.android.tutorial.pttest:

package com.pdftron.android.tutorial.pttest;

import android.app.Activity;
import android.content.res.Resources;
import android.os.Bundle;

import com.pdftron.common.PDFNetException;
import com.pdftron.pdf.PDFDoc;
import com.pdftron.pdf.PDFNet;
import com.pdftron.pdf.PDFViewCtrl;
import com.pdftron.pdf.tools.ToolManager;

import java.io.IOException;
import java.io.InputStream;

public class PTTestActivity extends Activity {

    private PDFViewCtrl mPDFViewCtrl;

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);

        // Initialize the library
        try {
            PDFNet.initialize(this, R.raw.pdfnet);
        } catch (PDFNetException e) {
            // Do something...
            e.printStackTrace();
        }

        // Inflate the view and get a reference to PDFViewCtrl
        setContentView(R.layout.main);
        mPDFViewCtrl = (PDFViewCtrl) findViewById(R.id.pdfviewctrl);

        // Load a document
        Resources res = getResources();
        InputStream is = res.openRawResource(R.raw.sample);
        try {
            PDFDoc doc = new PDFDoc(is);
            mPDFViewCtrl.setDoc(doc);
            // Or you can use the full path instead
            //doc = new PDFDoc("/mnt/sdcard/sample.pdf");
        } catch (PDFNetException | IOException e) {
            e.printStackTrace();
        }
    }

    @Override
    protected void onPause() {
        // This method simply stops the current ongoing rendering thread, text
        // search thread, and tool
        super.onPause();
        if (mPDFViewCtrl != null) {
            mPDFViewCtrl.pause();
        }
    }

    @Override
    protected void onResume() {
        // This method simply starts the rendering thread to ensure the PDF
        // content is available for viewing.
        super.onResume();
        if (mPDFViewCtrl != null) {
            mPDFViewCtrl.resume();
        }
    }

    @Override
    protected void onDestroy() {
        // Destroy PDFViewCtrl and clean up memory and used resources.
        super.onDestroy();
        if (mPDFViewCtrl != null) {
            mPDFViewCtrl.destroy();
        }
    }

    @Override
    public void onLowMemory() {
        // Call this method to lower PDFViewCtrl's memory consumption.
        super.onLowMemory();
        if (mPDFViewCtrl != null) {
            mPDFViewCtrl.purgeMemory();
        }
    }
}

5. Configure the manifest file

The last step is to configure the manifest file to start our PTTestActivity. Change the AndroidManifest.xml to the content below:

<manifest xmlns:android="http://schemas.android.com/apk/res/android"
    package="com.pdftron.android.tutorial.pttest">
    <application android:allowBackup="true"
        android:label="@string/app_name"
        android:icon="@mipmap/ic_launcher"
        android:roundIcon="@mipmap/ic_launcher_round"
        android:supportsRtl="true"
        android:theme="@style/AppTheme">
        <activity
            android:name="PTTestActivity"
            android:windowSoftInputMode="adjustPan" >
            <intent-filter>
                <action
                    android:name="android.intent.action.MAIN" />
                <category
                    android:name="android.intent.category.LAUNCHER" />
            </intent-filter>
        </activity>
    </application>
</manifest>

6. Run the app

Now if you build and run the sample project on a device, you will be able to scroll and zoom pages (pinch or double tap to zoom). You can also turn pages by flinging left or right.

[Screenshot: PTTest in action]

Adding support for Annotations, Text Selection and Form Filling

PDFNet comes with built-in support for text selection, interactive annotation creation and editing, form filling, link following, and more. These features are implemented using PDFNet’s Android API and are shipped in a separate library, PDFViewCtrlTools (see “What is the Tools library?” for more information).

To add support for annotations, text selection, and so on, you need to reference the Tools library in your project. The SDK package includes both the PDFViewCtrlTools source package (in the “lib\src\PDFViewCtrlTools\” folder) and the PDFViewCtrlTools library (“lib\PDFViewCtrlTools.aar”). You can either add the source package or include the library in your project.

A- Add PDFViewCtrlTools source package

If you decide to add the source package, first delete “app\libs\PDFNet.jar” (the PDFViewCtrlTools package has its own PDFNet.jar file). Then copy all folders in the SDK package's “lib\src\” folder, including PDFViewCtrlTools, PageCropper and FloatingActionButton, to the root folder of pttest. The project should now look like the screenshot below:

[Screenshot: project structure after copying the source folders]

Now we need to include the external projects in the “settings.gradle” file:

include ':PDFViewCtrlTools'
include ':FloatingActionButton'
include ':PageCropper'
include ':app'

Then add a dependency on PDFViewCtrlTools in build.gradle:

dependencies {
    compile project(':PDFViewCtrlTools')
}

At this point you should be able to build your project again. Make sure it compiles without errors (note that you may need to trigger the build more than once before the intermediate files are generated correctly).

B- Include PDFViewCtrlTools library

Instead of adding the source package, you can include the PDFViewCtrlTools AAR library provided in the SDK package (in the “\lib\” folder). First delete “app\libs\PDFNet.jar” (the PDFViewCtrlTools library has its own PDFNet.jar file). Then copy all AAR files (including PDFViewCtrlTools.aar, FloatingActionButton.aar and PageCropper.aar) from the SDK package's “\lib\” folder to the “app\libs\” folder. You also need to add the following to the “\app\build.gradle” file:

allprojects {
    repositories {
        flatDir {
            dirs 'libs'
        }
    }
}

dependencies {
    compile(name:'PDFViewCtrlTools', ext:'aar')
    compile(name:'FloatingActionButton', ext:'aar')
    compile(name:'PageCropper', ext:'aar')
}

At this point you should be able to build your project again.

Now that PDFViewCtrlTools has been added to the project, you can use the Tool Manager in your code. To do so, add the call that sets the tool manager just after the call to findViewById(R.id.pdfviewctrl):

// Inflate the view and get a reference to PDFViewCtrl
setContentView(R.layout.main);
mPDFViewCtrl = (PDFViewCtrl) findViewById(R.id.pdfviewctrl);

mPDFViewCtrl.setToolManager(new ToolManager(mPDFViewCtrl));

You are now ready to build and run the project again. When you run it, you can select text, follow links, and create and edit annotations. To create a new annotation, long press on an area of the document without text to trigger a popup listing the annotation types you can create. This behavior is shown in the screenshot below:

[Screenshot: annotation creation popup]

Opening encrypted documents

PDFNet supports opening encrypted PDF documents. If you try to open an encrypted document, PDFViewCtrl will pop up a dialog to request the password from the user. If you would like to implement your own interface or system for acquiring the password, you can initialize the PDFDoc’s security handler yourself before passing the document to PDFViewCtrl. To display an encrypted PDF this way, add the following snippet after creating the PDFDoc:

if (!doc.initStdSecurityHandler("password")) {
    // Wrong password...
    return;
}
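
If your own password interface lets the user try more than once, the check above can be wrapped in a simple retry loop. The following is a minimal sketch in plain Java, not SDK code: SecuredDoc is a hypothetical stand-in interface that exposes only the initStdSecurityHandler() call shown above, so that the logic can run without the SDK. In a real app you would wrap your PDFDoc in a small adapter, and that adapter may also need to handle any exception the real method declares.

```java
import java.util.List;

public class PasswordUnlockSketch {

    /** Hypothetical stand-in for the one PDFDoc method this sketch needs. */
    public interface SecuredDoc {
        boolean initStdSecurityHandler(String password);
    }

    /**
     * Tries each candidate password in order and returns the one that
     * unlocks the document, or null if none of them works.
     */
    public static String unlock(SecuredDoc doc, List<String> candidates) {
        for (String candidate : candidates) {
            if (doc.initStdSecurityHandler(candidate)) {
                return candidate;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Fake document that accepts only "secret"; a real app would pass
        // an adapter around its PDFDoc instead.
        SecuredDoc fake = password -> "secret".equals(password);
        String result = unlock(fake, List.of("guess", "secret"));
        System.out.println(result == null ? "Wrong password" : "Unlocked");
    }
}
```

In practice the candidate list would usually be replaced by passwords collected one at a time from your own dialog, with the loop ending when the user cancels.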

FAQ

What is the Tools library?
What are the differences between the “Standard” and “Full” libraries?
What is the pdfnet.res file?
How do I configure my ProGuard configuration file to properly obfuscate PDFNet?
How do I add views/widgets on top of PDFViewCtrl?

What is the Tools library?

PDFNet comes with built-in support for text selection, interactive annotation creation and editing, form filling and link following. These features have been implemented using two interfaces from PDFViewCtrl (PDFViewCtrl.ToolManager and PDFViewCtrl.Tool), and are shipped in a separate Android library project, PDFViewCtrlTools. This library also contains an implementation of a floating quick menu for accessing all of these features, a basic slider for navigating through pages and a text search toolbar.

With the source code of the library, developers have complete flexibility and control to customize how users interact with the PDF. More information on how the tools work and how to customize them for your app can be found in our documentation:

http://www.pdftron.com/pdfnet/mobile/docs/Android/pdfnet/javadoc/reference/com/pdftron/pdf/PDFViewCtrl.ToolManager.html

How to make use of the PDFNet Android PDFViewCtrl’s Tool and ToolManager interfaces:

https://groups.google.com/d/msg/pdfnet-sdk/fG-20n1gcPU/4Zslh603PZ8J

What are the differences between the “Standard” and “Full” libraries?

The SDK package includes two versions of the native libraries: standard and full.

To help our customers create applications with a smaller file size, the standard version omits the following rarely used features (which are present in the full version):

  • Convert, Optimizer, Redactor and Flattener classes;
  • PDF/A validation/conversion;
  • Converting PDF pages to TIFF and PNG formats (i.e., PDFDraw will not work when using these formats).

Rendering speed and quality are the same for both versions of the library.

What is the pdfnet.res file?

This file contains fonts, CMaps and other standard PDF resources that are needed to display generic PDF documents correctly (e.g., forms, text, free text annotations). If your documents are largely images and do not have text or annotations, then it may not be necessary. There are two ways to use this file in your project:

  1. The PDFNet.initialize() method accepts an extra parameter that can be used to load the resource file from the application bundle. For example:

PDFNet.initialize(this, R.raw.pdfnet);

This method will copy the file from “res/raw/pdfnet.res” to the private storage area of the application. An exception will be thrown if there is insufficient space to save the file, or if the resource ID can’t be found. Please note that the resource file must be named “pdfnet.res” when using this approach.

  2. You can call PDFNet.initialize() without the resource parameter, and use:

PDFNet.setResourcesPath("path/to/resources/file");

With this method you are handling installation of the resource file on your own. For example, the resource file can be downloaded on demand and saved at any location. When the resource file is ready for use it can be loaded using setResourcesPath().
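
As an illustration of the second approach, the sketch below copies resource data from an arbitrary stream (for example, a download) to a location of your choice and returns the resulting path. It is a hedged example in plain Java rather than SDK code: the class and method names are hypothetical, the byte array merely stands in for the real file, and java.nio.file is assumed to be available on your target platform (older Android versions would need stream-based file I/O instead).

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class ResourceInstallSketch {

    /**
     * Copies the resource data from 'source' into 'targetDir' under the
     * given file name and returns the absolute path of the written file.
     */
    public static String install(InputStream source, Path targetDir, String fileName)
            throws IOException {
        Files.createDirectories(targetDir);
        Path target = targetDir.resolve(fileName);
        // Write to a temporary file first so that an interrupted copy never
        // leaves a truncated resource file at the final location.
        Path temp = Files.createTempFile(targetDir, fileName, ".part");
        try {
            Files.copy(source, temp, StandardCopyOption.REPLACE_EXISTING);
            Files.move(temp, target, StandardCopyOption.REPLACE_EXISTING);
        } finally {
            Files.deleteIfExists(temp);
        }
        return target.toAbsolutePath().toString();
    }

    public static void main(String[] args) throws IOException {
        // Placeholder bytes stand in for the downloaded resource file.
        InputStream fakeDownload = new ByteArrayInputStream(new byte[] {1, 2, 3});
        Path dir = Files.createTempDirectory("pdfnet-res");
        String path = install(fakeDownload, dir, "pdfnet.res");
        System.out.println("Resource file ready at " + path);
        // In the app, the path would then be handed to the SDK:
        // PDFNet.setResourcesPath(path);
    }
}
```

Writing to a temporary file and moving it into place is a design choice of this sketch, not a requirement of the SDK; it simply avoids pointing setResourcesPath() at a partially written file.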

How do I configure my ProGuard configuration file to properly obfuscate PDFNet?

Due to the nature of the SDK in using native and managed code, you will need to tell ProGuard not to obfuscate some classes. You need to include the following in your config file:

-keep class com.pdftron.pdf.PDFViewCtrl$RenderCallback { *; }
-keep class com.pdftron.pdf.PDFViewCtrl$PrivateDownloader { *; }
-keep class com.pdftron.filters.CustomFilter$CustomFilterCallback { *; }
-keep class com.pdftron.pdf.PDFViewCtrl$LinkInfo { *; }
-keep class com.pdftron.pdf.ProgressMonitor { *; }
-keep class com.pdftron.sdf.ProgressMonitor { *; }
-keep class com.pdftron.common.PDFNetException { *; }
-keep class com.pdftron.pdf.PreviewHandler { *; }
-keep class com.pdftron.pdf.RequestHandler { *; }
-keep class com.pdftron.pdf.TextSearchResult { *; }

How do I add views/widgets on top of PDFViewCtrl?

Since PDFViewCtrl extends ViewGroup, it is possible to add views or widgets on top of a page programmatically using PDFViewCtrl.addView(…). One important point when using this approach is that you have to set the layout of the inserted view yourself (i.e., call view.layout(…)), since PDFViewCtrl does not position its child views the way a LinearLayout or RelativeLayout does. Also, depending on your requirements, you may have to extend PDFViewCtrl to override the touch event methods and perform actions on the added view(s).
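
Because the bounds passed to view.layout(…) have to be computed by hand, a small helper can keep that arithmetic in one place. The sketch below is plain Java with hypothetical names (it is not part of the SDK): it centers a child of a given size on a point and clamps it so that it stays inside the parent. It works in plain view coordinates only; mapping from PDF page coordinates, and any measuring the child view may need before layout, are outside its scope.

```java
import java.util.Arrays;

public class OverlayBoundsSketch {

    /**
     * Returns {left, top, right, bottom} for a child of the given size,
     * centered on (centerX, centerY) and clamped to stay inside a parent
     * of size parentWidth x parentHeight.
     */
    public static int[] centeredBounds(int centerX, int centerY,
                                       int childWidth, int childHeight,
                                       int parentWidth, int parentHeight) {
        int maxLeft = Math.max(0, parentWidth - childWidth);
        int maxTop = Math.max(0, parentHeight - childHeight);
        int left = clamp(centerX - childWidth / 2, 0, maxLeft);
        int top = clamp(centerY - childHeight / 2, 0, maxTop);
        return new int[] {left, top, left + childWidth, top + childHeight};
    }

    private static int clamp(int value, int min, int max) {
        return Math.max(min, Math.min(max, value));
    }

    public static void main(String[] args) {
        int[] b = centeredBounds(100, 50, 80, 40, 1080, 1920);
        System.out.println(Arrays.toString(b));
        // In an Activity, the result would be applied roughly like this
        // ("overlay" is a hypothetical View created by your app):
        // mPDFViewCtrl.addView(overlay);
        // overlay.layout(b[0], b[1], b[2], b[3]);
    }
}
```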

Another option is to use a different view/layer and position it on top of PDFViewCtrl through the layout xml file.

More info:

https://groups.google.com/d/msg/pdfnet-sdk/s99GQwKiRLc/0XSO6OHuVrUJ

Additional Resources

Samples

The CompleteReader sample in the SDK package is a light version of our Xodo app in the Play Store. It has almost all of the functionality of the Xodo app except collaboration. Feel free to download and run Xodo and compare it with CompleteReader!

Knowledge Base

Browse our public forum for more information about PDFNet:

https://groups.google.com/forum/#!forum/pdfnet-sdk

Android related posts can be found using:

https://groups.google.com/forum/#!searchin/pdfnet-sdk/android

Support

Our support page can be found here:

http://www.pdftron.com/support

If you think you have encountered a bug or issue with PDFNet, you can submit a ticket by filling out the following form:

http://www.pdftron.com/support/reportproblem.html

