Technical FAQs

Question

If I have a PDF document that only has an embedded image in it (no text objects, etc.), can PrizmDoc Viewer take it and create a searchable PDF file from it?

Answer

Yes. PrizmDoc’s Content Conversion Services can take an image-only PDF and create a searchable PDF file from it. This can be done by modifying the input.dest.pdfOptions.ocr options object; see our documentation here.

If you are attempting to make a searchable PDF from an existing PDF document, please note that the source PDF file should be an image-only PDF. PrizmDoc will not create a searchable file from already-existing vector content.

This feature was introduced in PrizmDoc 13.1; please see our Release Notes for more information.
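
As a rough illustration, the request body for such a conversion might be assembled as shown below. Only the input.dest.pdfOptions.ocr path comes from the answer above; the "language" field, its "english" value, and the helper name build_searchable_pdf_request are assumptions that should be checked against the Content Conversion Service documentation for your PrizmDoc version.

```python
import json


def build_searchable_pdf_request(file_id, ocr_language="english"):
    """Build a JSON-serializable body for an OCR-enabled PDF conversion.

    The "language" key and its default value are assumptions made for
    this sketch; confirm the exact option names in the product docs.
    """
    return {
        "input": {
            "sources": [{"fileId": file_id}],
            "dest": {
                "format": "pdf",
                # Presence of the ocr object is what requests a searchable PDF.
                "pdfOptions": {"ocr": {"language": ocr_language}},
            },
        }
    }


if __name__ == "__main__":
    # "example-file-id" is a placeholder, not a real WorkFile id.
    print(json.dumps(build_searchable_pdf_request("example-file-id"), indent=4))
```

The resulting dictionary would be sent as the JSON body of a content conversion request for a WorkFile that holds the image-only PDF.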

On March 10, 2021, Accusoft announced the arrival of the free-to-use Accusoft PDF Viewer, the latest addition to its family of PDF solutions. An entirely client-side integration with no complicated server dependencies, this lightweight JavaScript PDF viewer also features a responsive UI for out-of-the-box mobile support.

“We’re excited to offer this free version of the Accusoft PDF Viewer to developers,” says Jack Berlin, CEO of Accusoft. “Our team worked hard to build a viewer that’s a step above what you can get from open source offerings. We think it’s going to solve a lot of the problems developers typically encounter with existing PDF libraries.”

Accusoft PDF Viewer integrates into an application quickly and easily with just a few snippets of code. It runs entirely within the browser to deliver an optimized viewing experience across all devices. The intuitive UI controls allow users to zoom, pan, jump to page, navigate thumbnails, and pinch-to-zoom on mobile screens with ease. And thanks to lightning fast full-text search, locating essential information is easier than ever.

“Accusoft PDF Viewer is great for developers because it allows them to maintain complete control over documents without having to set up any cumbersome server infrastructure,” says Mark Hansen, Product Manager. “Having a responsive UI that adapts to mobile displays will also increase their flexibility tremendously.”

The free version of Accusoft PDF Viewer allows developers to quickly add powerful viewing capabilities to their web applications. We’re currently working on additional features (such as annotation and eSignature) that will be included in an upgraded paid version.

To learn more about Accusoft PDF Viewer or download it for a first-hand look, please visit our website.

About Accusoft:
Founded in 1991, Accusoft is a software development company specializing in content processing, conversion, and automation solutions. From out-of-the-box and configurable applications to APIs built for developers, Accusoft software enables users to solve their most complex workflow challenges and gain insights from content in any format, on any device. Backed by 40 patents, the company’s flagship products, including OnTask, PrizmDoc™ Viewer, and ImageGear, are designed to improve productivity, provide actionable data, and deliver results that matter. The Accusoft team is dedicated to continuous innovation through customer-centric product development, new version release, and a passion for understanding industry trends that drive consumer demand. Visit us at www.accusoft.com.

On August 3, 2021, Accusoft announced the release of the paid Professional version of Accusoft PDF Viewer. Initially released in March of 2021, the Standard version of Accusoft PDF Viewer is a free-to-use, lightweight JavaScript PDF library featuring a responsive UI for out-of-the-box mobile support. The new Professional version adds enhanced PDF tools and document functionality without introducing any complex server dependencies that could impact application security or performance.

“We’ve received tremendous feedback so far regarding the Standard version of Accusoft PDF Viewer,” says Jack Berlin, CEO of Accusoft. “With the release of the paid Professional version, customers now have a clear upgrade path that allows them to add new features without having to rethink their application architecture.”

Key Accusoft PDF Viewer Professional features include:

  • Multiple Annotation Types
  • Customizable UI
  • White Labeling
  • Electronic Signature

As an entirely client-side integration, Accusoft PDF Viewer can be incorporated into any web application with just a few lines of code. The paid Professional version features the same intuitive UI controls that provide an optimized viewing experience across all screen types, making it ideal for web apps that need to run on both desktop and mobile devices.

“We did a lot of research to determine which features are most important to developers,” says Mark Hansen, Product Manager at Accusoft. “The ability to markup and electronically sign documents without having to rely on external servers or backend processing is going to be a gamechanger for a lot of applications.”

To learn more about the latest Accusoft PDF Viewer features, please visit our website.

Anyone who has watched a thriller about government secrecy probably has an image in mind about what it means to redact a document. That picture usually involves piles of classified pages with entire paragraphs blotted out with black marker. At some point, a character holds a sheet up to a light and finds a spot where the redacted text is just barely visible enough to provide them with the next clue that moves the story forward. They may even use some special form of scanner that allows them to see the hidden material.

Such scenes reveal the fundamental problem with text redaction. As long as the content remains present, there might be some way of making it visible again, which presents serious problems in terms of privacy and security. The transition to purely digital documents should have made these concerns a thing of the past. Unfortunately, too many people fail to take advantage of PDF redaction tools and leave their confidential material dangerously exposed.

PDFs Are Not Like Physical Documents

In 2016, Democrats in the U.S. House of Representatives made the embarrassing mistake of releasing a cache of documents that contained improper redactions. Journalists easily found what was hidden beneath the black markings by copying the PDF text and pasting it into another document, which instantly revealed the redacted material.

This was not the first time government officials, or other organizations, released improperly redacted documents. Part of the reason why this mistake keeps happening is that people frequently apply the same practices used with physical documents to digital documents. It’s a simple matter to use shapes or drawing tools to obscure text in a PDF, but doing so only hides the content from view rather than removing it altogether.

As the “copy and paste” trick described above shows, it’s often trivially easy to bypass such “redactions.” That’s because a PDF document is not like a physical, printed document, even though it resembles one in a viewer. A PDF consists of multiple layers, as well as extensive metadata that isn’t visible. Adding a black box over text simply adds another layer to the document. Accessing the layer of text information underneath is quite simple, even with relatively basic software tools.

Redacting Content from Electronic Documents

The first step in true redaction involves the removal of selected content entirely. This ensures that even if someone is able to extract the text layer from the document, the redacted portions will not become visible when pasted elsewhere.
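
The difference between covering text and removing it can be sketched with a toy page model. This is not how a PDF is stored internally and it uses no PrizmDoc API; the Box, TextRun, and BlackRect types are hypothetical stand-ins for page objects, chosen only to show why text extraction ignores a box drawn on top.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    x0: float
    y0: float
    x1: float
    y1: float

    def overlaps(self, other: "Box") -> bool:
        # Two boxes overlap unless one lies entirely to one side of the other.
        return not (self.x1 <= other.x0 or other.x1 <= self.x0
                    or self.y1 <= other.y0 or other.y1 <= self.y0)


@dataclass(frozen=True)
class TextRun:
    text: str
    box: Box


@dataclass(frozen=True)
class BlackRect:
    box: Box


def extract_text(page):
    """Return the text of every TextRun, whatever is drawn on top of it."""
    return " ".join(obj.text for obj in page if isinstance(obj, TextRun))


def cover(page, area):
    """Cosmetic 'redaction': draw a rectangle and leave the text in place."""
    return page + [BlackRect(area)]


def redact(page, area):
    """True redaction: drop overlapping text runs, then draw the rectangle."""
    kept = [obj for obj in page
            if not (isinstance(obj, TextRun) and obj.box.overlaps(area))]
    return kept + [BlackRect(area)]


page = [
    TextRun("Account holder:", Box(0, 0, 100, 10)),
    TextRun("Jane Doe", Box(110, 0, 170, 10)),
]
secret_area = Box(105, 0, 175, 10)

print(extract_text(cover(page, secret_area)))   # Account holder: Jane Doe
print(extract_text(redact(page, secret_area)))  # Account holder:
```

In the covered page the name is still returned by text extraction, which is the "copy and paste" failure; in the redacted page it is gone because the object itself was removed.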

However, even removing the visible text itself may not be enough to protect confidential information. That’s because the document may still hold data describing how to render the redacted portions. While it would be possible to avoid this problem by converting a PDF to a bitmap image, removing the portions to be redacted, and then building an entirely new document using OCR, this process is time-consuming and difficult to scale.

Using PDF Redaction Tools in PrizmDoc Viewer

A much more efficient approach would be to utilize dedicated PDF redaction tools like those built into PrizmDoc Viewer. Thanks to a sophisticated and intuitive API, PrizmDoc allows users to perform a number of redaction functions within its easy-to-use HTML5 viewer:

  • Add individual redactions by selecting text, applying a redaction rectangle, or marking out the whole page.
  • Perform a search for specific terms and apply redactions to each instance.
  • Add redaction layers to a document that can be saved and edited during preparation.
  • Apply redaction reasons to explain why certain content has been removed.

When integrating PrizmDoc Viewer into their applications, developers can also customize the HTML5 viewer to apply predefined redactions, preload entire redaction layers, or create unique redactions programmatically. This is especially useful for high-volume document workflows that need to identify and remove commonly used private data like Social Security numbers, contact information, and financial information.
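
To make the idea of locating commonly used private data concrete, here is a minimal sketch that finds Social Security number patterns in plain text. It does not call any PrizmDoc API; the pattern covers only the NNN-NN-NNNN layout, and the function names are hypothetical. In a real workflow the returned offsets would still need to be mapped onto whatever redaction definitions the viewer or server expects.

```python
import re

# Matches only the common NNN-NN-NNNN layout; an illustrative pattern,
# not a complete SSN validator.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def find_ssn_spans(text):
    """Return (start, end) character offsets of each SSN-like match."""
    return [match.span() for match in SSN_PATTERN.finditer(text)]


def mask_spans(text, spans, fill="X"):
    """Overwrite the characters in each span so the digits are removed."""
    chars = list(text)
    for start, end in spans:
        chars[start:end] = fill * (end - start)
    return "".join(chars)


sample = "SSN 123-45-6789 on file"
spans = find_ssn_spans(sample)
print(spans)                      # [(4, 15)]
print(mask_spans(sample, spans))  # SSN XXXXXXXXXXX on file
```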

PrizmDoc Viewer’s redaction API strips out all information associated with the redacted material from the document. That means any removed content isn’t just no longer visible; it also can’t be highlighted, copied, searched, or indexed because it’s no longer present in any way. Remaining text content, however, is still readily available. Even better, sharing documents through the HTML5 viewer also hides metadata that could contain sensitive information.

When redactions are made, PrizmDoc Viewer allows users to indicate the reasons for these removals. This is especially important for transparency purposes when working with government documents. The redaction API supports single and multiple redaction reasons for improved clarity.

Of course, most organizations still need to retain access to unredacted documents for internal use. That’s why PrizmDoc Viewer retains an unaltered version of the document safely uploaded to the server. The actual redacted document is a new file with all redacted content removed. Users can then use PrizmDoc Viewer’s sharing controls to further manage access to the file.

Redact Your Documents the Right Way

Today’s applications can’t afford to take redaction lightly. Whether they’re building the next generation of government technologies or LegalTech applications, developers need to provide their customers with the ability to easily screen documents to protect sensitive and private information from being exposed. By integrating viewing and document editing solutions with PDF redaction tools, they can help organizations take control over document security and avoid embarrassing redaction mistakes that could expose them to severe liability.

PrizmDoc Viewer’s versatile HTML5 viewing capabilities leverage powerful APIs to easily incorporate document redaction into application workflows. With just a simple API call, users can quickly locate and remove information from documents before sharing them with anyone outside the organization. To see PrizmDoc Viewer’s PDF redaction tools first hand, check out our interactive online demo today.

Question

We are converting emails into PDFs using PrizmDoc. When the PDF is viewed in PrizmDoc, if you hover over the names in the email header, you see a mailto link that provides the email address.

Is there a way to remove those links during the conversion process? We wish to ensure there are no email addresses present in the PDFs.

Answer

To work around this issue, you can first convert the email (MSG) to a TIFF file. This will remove the links and just keep the name of the email recipient. Then convert the TIFF file to a searchable PDF.

This workaround requires that your PrizmDoc license has the OCR option enabled to create the searchable PDF. If you do not need to make the text searchable, then you can just convert the TIFF to a PDF.
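
The two-step workaround can be sketched as a small chain in which the output of the first conversion feeds the second. The convert callable is hypothetical: it stands for whatever code submits a conversion request, waits for it to finish, and returns the resulting file id. The shape of the OCR option is also an assumption.

```python
def conversion_body(file_id, dest):
    """Request body with one source file and the given destination options."""
    return {"input": {"sources": [{"fileId": file_id}], "dest": dest}}


def email_to_pdf_dests(searchable):
    """Return the destination options for each step, in order."""
    pdf_dest = {"format": "pdf"}
    if searchable:
        # Assumed option shape; needs an OCR-enabled license per the answer above.
        pdf_dest["pdfOptions"] = {"ocr": {"language": "english"}}
    return [{"format": "tiff"}, pdf_dest]


def run_chain(convert, file_id, dests):
    """Run each conversion in turn, passing the output file to the next step.

    convert(body) is supplied by the caller and must return the file id
    of the converted output.
    """
    for dest in dests:
        file_id = convert(conversion_body(file_id, dest))
    return file_id
```

Calling run_chain(convert, msg_file_id, email_to_pdf_dests(searchable=True)) would first rasterize the email to TIFF, which drops the mailto links, and then rebuild a PDF with OCR text.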

Question

What are the technical details/process of “Flattening” a PDF document?

Answer

It is possible to “Flatten” PDF documents in PrizmDoc Viewer. You can do this by converting the document to a raster format (TIFF is recommended for PDF conversion) using PrizmDoc’s Content Conversion Service, and then converting it back to PDF format. This will result in a PDF with a single layer and no hidden objects. However, this will usually lower the quality and increase the file size of PDFs that are largely text.

Here is an example workflow using the WorkFile API and the Content Conversion Service API:

1. Create a WorkFile from PDF

POST {{pccisUrl}}/PCCIS/V1/WorkFile
Content-Type: application/octet-stream

{{file bytes}}

2. Initiate Conversion to TIFF

POST {{pccisUrl}}/v2/contentConverters
Content-Type: application/json

{
    "input": {
        "sources": [
            {
                "fileId": "{{fileId}}"
            }
        ],
        "dest": {
            "format": "tiff"
        }
    }
}

3. Poll until response["state"] === "complete"

GET {{pccisUrl}}/v2/contentConverters/{{processId}}

4. Initiate Conversion from TIFF back to PDF

POST {{pccisUrl}}/v2/contentConverters
Content-Type: application/json

{
    "input": {
        "sources": [
            {
                "fileId": "{{fileId_from_Step3_output}}"
            }
        ],
        "dest": {
            "format": "pdf"
        }
    }
}

5. Poll again

GET {{pccisUrl}}/v2/contentConverters/{{processId}}

6. Download

GET {{pccisUrl}}/PCCIS/V1/WorkFile/{{fileId}}?ContentDispositionFileName={{desiredFileNameWithExtension}}
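
The polling in steps 3 and 5 can be sketched as a small helper. The get_status callable is hypothetical: it stands for code that performs the GET request shown above and returns the parsed JSON as a dictionary. Treating "error" as a terminal state is an assumption, as are the default interval and attempt count.

```python
import time


def wait_until_complete(get_status, process_id, interval_s=1.0,
                        max_attempts=60, sleep=time.sleep):
    """Poll get_status(process_id) until its "state" is "complete".

    Returns the final status dictionary, raises RuntimeError on an
    "error" state, and raises TimeoutError if max_attempts is exhausted.
    """
    for _ in range(max_attempts):
        status = get_status(process_id)
        state = status.get("state")
        if state == "complete":
            return status
        if state == "error":
            # Assumed terminal state; adjust to match the actual API.
            raise RuntimeError(f"conversion {process_id} failed: {status}")
        sleep(interval_s)
    raise TimeoutError(
        f"conversion {process_id} not complete after {max_attempts} polls"
    )
```

The sleep parameter is injectable so the loop can be exercised without real delays; in production the default time.sleep is used.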

Sept. 7, 2022 – TAMPA, Fla. – Accusoft, a software development company specializing in content processing, conversion, and automation solutions, and Snowbound, a leader in document viewing and conversion SDK solutions, announced today that they have entered into a definitive agreement under which Accusoft will acquire Snowbound. In the largest acquisition in its 30-year history, the transaction will significantly expand Accusoft’s presence and product portfolio.

Snowbound’s VirtualViewer® technology, supported by its powerful RasterMaster® SDK, supports numerous formats including PDF, MS Office, AFP, DWG, TIFF, email, video, audio files, and more within one universal interface. Its REST API and RESTful content handler provide a more flexible development and deployment capability enabling it to be easily integrated into most applications. In addition, the company offers connectors for IBM FileNet, Alfresco, and Pega. This acquisition will enable Accusoft to expand into new viewing and collaboration technologies offering customers a more robust web-based document viewing experience. 

“Today, we celebrate the joining of two companies who have both driven significant innovation for web-based viewing, conversion, and imaging SDK technologies. I have always had the utmost respect for Snowbound’s leadership team and their employees as we have competed against one another for sales opportunities over the decades.  I am honored to bring Snowbound into the Accusoft family,” said Jack Berlin, CEO of Accusoft.

“We were incredibly selective as we looked for the right acquisition partner. We were deliberate in selecting an organization with a leadership team and product portfolio that would be compatible with our own, and that would continue to grow, develop and nurture what we have built at Snowbound. We have proudly driven 26 years of innovation in the way that companies securely share, collaborate, and process documents and images. With the acquisition, our technology will expand RasterMaster®’s and VirtualViewer®’s Java-based feature set and allow continued empowerment to customers as they navigate the ever-changing world of digital transformation and the complexities of document management,” said Simon Wieczner, CEO of Snowbound.

While the acquisition is complete, Accusoft will wait until January 2023 to take full operational control of Snowbound. In the meantime, the two leadership teams will partner to close out a strong 2022 and transition the team and its assets.

For more information about Accusoft, please visit https://www.accusoft.com/.

About Snowbound

For over two decades, Snowbound Software has been the independent leader in document viewing and conversion technology. It plays an integral role in enhancing and speeding company workflows for the Fortune 2000, including insurance claims processing, financial transactions, and more. Snowbound excels in providing customers with powerful solutions for capturing, viewing, processing, and archiving hundreds of different document and image types. Thanks to its pure Java technology and multi-environment support, Snowbound’s products operate across all popular platforms and can be integrated into new or existing enterprise content management systems. Nine of the 10 largest banks in the United States (seven of 10 in the world), as well as some of the biggest healthcare providers, government agencies, and insurance companies rely on Snowbound for their mission-critical needs. For more information, contact us at 617-607-2010 or info@snowbound.com, or visit www.snowbound.com

On July 12, 2022, Accusoft announced the latest update to PrizmDoc, its industry-leading document processing integration. The PrizmDoc 13.21 update improves existing features and adds key functionality related to format support, redaction capabilities, content conversion, and more, allowing developers to offer enhanced functionality within their applications. 

One of the main improvements in this release is to PrizmDoc’s Content Conversion Service (CCS). PrizmDoc now provides the ability to convert PDF documents to MS Word (DOCX) documents, making shared collaboration easier than ever before.

Other features and updates in this release include: 

  • High-Efficiency Image File Format (HEIF, HEIC) support for viewing, redaction, and conversion to JPG/JPEG, PDF, PNG, SVG and TIFF. 
  • PrizmDoc Viewer Markup Burner API now provides the ability to burn in redaction reason text for transparent (draft mode) redactions and provides the ability to remove PDF AcroForm fields. 
  • Improved performance of the PAS GET MarkupLayers API when using AWS S3 storage, which significantly reduces network traffic between PAS and S3.

PrizmDoc provides customizable document processing that helps developers add in-browser document creation, editing, and collaboration functionality to their software applications.

For more information about PrizmDoc or to download a free trial, please visit our website.

Question

I am trying to deploy my ImageGear Pro ActiveX project and am receiving an error stating

The module igPDF18a.ocx failed to load

when registering the igPDF18a.ocx component. Why is this occurring, and how can I register the component correctly?

Answer

To register your igPDF18a.ocx component, you will need to run the following command:

regsvr32 igPDF18a.ocx

If you receive an error stating that the component failed to load, then that likely means that regsvr32 is not finding the necessary dependencies for the PDF component.

The first thing you will want to check is that you have the Microsoft Visual C++ 10.0 CRT (x86) installed on the machine. You can download this from Microsoft’s site here:

https://www.microsoft.com/en-us/download/details.aspx?id=5555

The next thing you will want to check for is the DL100*.dll files. These files should be included in the deployment package generated by the deployment packaging wizard if you included the PDF component when generating the dependencies. These files must be in the same folder as the igPDF18a.ocx component in order to register it.

With those dependencies, you should be able to register the PDF component with regsvr32 without issue.
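
Before running regsvr32, the file-level part of this checklist can be verified with a short script. This is only a sketch: the function name is hypothetical, it checks only that the OCX and at least one DL100*.dll file sit in the same folder, and it does not detect whether the Visual C++ runtime is installed.

```python
import fnmatch
import os


def find_pdf_dependencies(folder, ocx_name="igPDF18a.ocx",
                          dll_pattern="DL100*.dll"):
    """Report whether the OCX and any DL100*.dll files are in `folder`.

    File names are compared case-insensitively, as on Windows.
    """
    names = os.listdir(folder)
    lowered = [name.lower() for name in names]
    dlls = sorted(
        name for name in names
        if fnmatch.fnmatchcase(name.lower(), dll_pattern.lower())
    )
    return {
        "ocx_present": ocx_name.lower() in lowered,
        "dl100_dlls": dlls,
    }


if __name__ == "__main__":
    report = find_pdf_dependencies(os.getcwd())
    print(report)
    if not report["ocx_present"] or not report["dl100_dlls"]:
        print("Missing files: copy the DL100*.dll files next to the OCX first.")
```

Run it from the deployment folder; if either entry is missing, registration is likely to fail with the same "failed to load" error.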