Technical FAQs

Question

In some other viewers, there are highlights or markers that appear on the UI to indicate that annotations are available for a given page or document. Is there a way to implement this in PrizmDoc?

Answer

Yes. Make a MarkupLayerRecords request to determine whether any marks pertain to the given Viewing Session. Keep in mind that documents do not have a specific set of annotations associated with them; Markup IDs do, and you can specify any Markup ID you want when you create a viewing session. The following example highlights the thumbnails button and the thumbnails of pages that contain marks:

// Add rules to your CSS for the following classes.
// The actual style information can be whatever you like.
//
// .mark-indicator {
//     background-color: gold !important;
// }
//
// .marked-page-indicator {
//     background-color: gold !important;
// }

let pasUrl = "http://localhost/pas-service"; // Example PAS proxy URL
let viewingSessionId = "<%= viewingSessionId %>"; // Example viewingSessionId, rendered by a server-side template (quoted so the result is a string literal)
let thumbnailButton = $(".pcc-icon-thumbnails");
let pageIndicatorsAdded = false;
let thumbnailsClicked = false;
let marksRetrieved = false;
let markedPages = {};

async function addPageIndicators() {
    console.log("Attempting to add page indicators...");
    if (thumbnailsClicked && marksRetrieved && !pageIndicatorsAdded) {
        console.log("Conditions met.");

        let wrappers = $(".pccThumbnailWrapper");

        wrappers.each(function(index, wrapper) {
            if (markedPages[index]) {
                $(wrapper).addClass("marked-page-indicator");
            }
        });

        pageIndicatorsAdded = true;
    } else {
        console.log("Conditions not met");
    }
}

thumbnailButton.click(function() {
    console.log("Thumbnails button clicked.");

    thumbnailsClicked = true;

    addPageIndicators();
});

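async function apiCall(type, url, body) {
    let options = {
        "type": type,
        "url": url
    };

    // Only attach a JSON body when one is supplied, so GET requests
    // do not get a stray "{}" appended to the query string.
    if (body !== undefined) {
        options.contentType = "application/json";
        options.data = JSON.stringify(body);
    }

    return await $.ajax(options);
}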

async function createMarkIndicators() {
    let output = await apiCall("GET", `${pasUrl}/MarkupLayers/u${viewingSessionId}`);

    if (output.length > 0) {
        console.log("Found layers.");

        thumbnailButton.addClass("mark-indicator");

        let layers = await Promise.all(output.map(function(element) {
            return apiCall("GET", `${pasUrl}/MarkupLayers/u${viewingSessionId}/${element.layerRecordId}`);
        }));

        layers.forEach(function(layer) {
            layer.marks.forEach(function(mark) {
                markedPages[mark.pageNumber - 1] = true;
            });
        });

        marksRetrieved = true;

        console.log("Marks retrieved.");

        addPageIndicators();
    } else {
        console.log("No layers found.");
    }
}

createMarkIndicators();

Question

How can I tell which server has the cache for a specific document in a clustered PrizmDoc environment?

Answer

When a document is viewed, a Viewing Session ID is created. That Viewing Session ID contains information about which server in the cluster is doing the work; however, it is encoded and cannot be read directly.

In order to determine which PrizmDoc cluster server is doing the work for a specific document in a specific viewing session, you can do a text search for the Viewing Session ID in the plb.sep_multi.log on all servers. Only one server in the cluster will be a match for that Viewing Session ID.
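If you already collect the log text from each server, the search described above can be sketched in a few lines. This is only an illustration: the function name, the server names, and the log excerpts are hypothetical, and reading plb.sep_multi.log from each machine is left to whatever tooling you use.

```javascript
// Hypothetical helper: given the text of each server's plb.sep_multi.log,
// return the names of the servers whose log mentions the Viewing Session ID.
function findServersForSession(logsByServer, viewingSessionId) {
    return Object.keys(logsByServer).filter(function (serverName) {
        return logsByServer[serverName].includes(viewingSessionId);
    });
}

// Made-up log excerpts, for illustration only.
const sampleLogs = {
    "prizmdoc-01": "... request for viewing session AbC123 ...",
    "prizmdoc-02": "... request for viewing session XyZ789 ..."
};

// Only prizmdoc-02 mentions this ID, so it is the only name returned.
console.log(findServersForSession(sampleLogs, "XyZ789"));
```

A plain text search with your operating system's own tools works just as well; the point is simply that only one server's log should contain the ID.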

Accusoft’s FormSuite for Structured Forms is a powerful SDK that allows you to integrate character recognition, form identification, document cleanup, and data capture capabilities into your software applications. You can set up unique form templates based on your processing needs and then design customized output architecture to extract data for delivery to a database or other downstream applications, helping you get to production faster or bring a new level of functionality to your legacy systems.

Setting all of that functionality up, however, can be a daunting task, especially if you’re working with a wide variety of form types. That’s why our FormSuite enablement services team is available to help you implement the features you need to ensure lasting results. Whether you’re facing bandwidth constraints or lack the resources to build expertise quickly, our FormSuite experts bridge the gap to make your project a success. Our enablement services team takes a five-step approach to every engagement.

The Accusoft Approach to Enablement Services

Step 1: Thorough Architecture Review

We start by conducting a top-to-bottom analysis of your production or operational environment. Our review not only evaluates your system architecture and data workflow, but also breaks down the details of your potential use cases and existing work samples.

Step 2: Identifying the Right Fit

Next, we determine the best FormSuite options based on your unique requirements and build you a custom enablement plan that will equip you with the instruction and assistance you need for a successful implementation.

Step 3: Training Your Team

Armed with information about your application’s specific requirements, we develop a customized training program to give your team a solid foundation for future development and ongoing maintenance. From guidance on form template creation and image enhancement to working with the forms API, we provide you with targeted guidance designed to help you solve potential challenges unique to your application environment.

Step 4: Implementation Support

Once the training is complete, you’ll have the foundational knowledge required to build the forms processing workflows your application requires. Our FormSuite experts remain on call to answer your questions so you can achieve your integration faster and ensure that you’re processing forms accurately.

Step 5: Preparing for Long-Term Success

Our enablement services prepare you to manage your implementation over the long term. We not only show you how to maintain the current environment, but also identify potential opportunities to deploy new features as your application scales in the future.

Keep the Partnership Going

Following your integration, we also provide ongoing support options to our customers whether or not they’ve utilized our enablement services. You get free Upgrade Support for 90 days after initial purchase, which includes email support and product upgrades. After that period, you can extend Upgrade Support, or elect to transition to our Standard Support or Priority Support annual plans.

To learn more about FormSuite for Structured Forms enablement services, talk to one of our solutions engineers. We’re ready to help you get your integration started!

Question

When using Content Conversion Services, what are the supported input formats that it takes for conversion?

Answer

When using Content Conversion Services, you can input any image and document source type that PrizmDoc supports.

For more information, see the Content Conversion Services API documentation.

Question

A request to http://localhost:3000/servicesConnection returns a 580 error. What could the issue be?

Answer

This indicates that PAS cannot communicate with the backend.

To fix this, verify that the PCCIS connection configuration section of pcc.*.yml is correct and that the backend is healthy and listening.

If the issue persists, check the logs/pas/pas-*.log files for network related issues.

For more information, see the Connection Issues topic.
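As a rough sketch, a small script can turn the status code from this route into a hint about where to look. The function name is hypothetical, and treating 200 as the healthy response is an assumption; only the meaning of 580 comes from the answer above.

```javascript
// Interpret the HTTP status returned by the /servicesConnection route.
// 580 is the status described above; treating 200 as healthy is an assumption.
function describeServicesConnection(status) {
    if (status === 580) {
        return "PAS cannot communicate with the backend. Check the PCCIS connection settings and backend health.";
    }
    if (status === 200) {
        return "PAS reached the backend.";
    }
    return "Unexpected status " + status + ". Check the PAS logs.";
}

// Hypothetical usage where a global fetch is available (browsers, Node 18+):
// fetch("http://localhost:3000/servicesConnection")
//     .then(function (response) {
//         console.log(describeServicesConnection(response.status));
//     });

console.log(describeServicesConnection(580));
```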

Question

What are the absolute essentials for embedding the PrizmDoc Viewer into my web page/application?

Answer

Viewer API (viewercontrol.js)

The Viewer API is the base building block of the Viewer. We ensure that API changes are backward compatible with point releases (for example, PrizmDoc v13.5 → PrizmDoc v13.6) and will not introduce breaking changes unless critical. With major releases we also endeavor to ensure backward compatibility with previous releases of the Viewer API.

HTML Templates (viewerCustomization.js) and CSS (viewercontrol.css, viewer.css)

The Viewer that is shipped with the product will be maintained and enhanced from release to release. The Viewer HTML and CSS markup will change with each release. Once you have begun to modify your markup, it is recommended that you consider subsequent PrizmDoc releases as sample code, in which you would evaluate product changes and choose to incorporate all or parts of those changes into your customization.

JavaScript files (viewer.js)

The Viewer JavaScript that lies above the Viewer API is unobfuscated and open for customization. While we expect many developer needs will be satisfied through configuration parameters and minor HTML or styling changes, some developers will desire to modify viewer.js for more advanced customization. You should carefully consider your development and ongoing maintenance strategy to ensure that future releases of PrizmDoc are easy to integrate into your customizations. We cannot guarantee backward compatibility of viewer.js in future releases as it is central to the functionality of the Viewer.

For information on integrating PrizmDoc Viewer, see the Getting Started section of the documentation.

For more in-depth customization, see the PrizmDoc Customization section of the documentation.

Question

What quality should my images be for processing form data and recognition using FormSuite?

Answer

In all cases, you want to have your images as clear and as clean as possible. For any particular procedure, please consider the following:

OCR and ICR: Capture images at a resolution of at least 300 DPI. Ideally, work in black and white, which allows the objects of interest in your image to be better defined and recognized. Remove as much noise from the image as possible. You want the text on the image to be as legible as it would need to be for a human reader. For ICR, ensure that the characters are printed (no cursive text, etc.).

Barcode recognition: As with OCR and ICR, capture images at 300 DPI or higher; working with black and white content can provide excellent results. Ensure that the bars in the barcodes are clearly defined on the image and are not malformed (for example, the barcodes should have the proper start and stop sequences). Clear as much noise from the image as possible.

Forms matching and registration: As with the two items above, capture your documents at 300 DPI or higher. Ensure that the resolution is consistent between your form templates and incoming batch images. Form templates should contain only data that is common to every image being processed (i.e., form fields and the text that appears on the blank form itself). The template should not contain filled-in field information, as this will affect the forms matching process.
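The resolution guidance above can be expressed as a simple pre-flight check before a batch is processed. This is a minimal sketch, not part of FormSuite: the function name is hypothetical, and how you read the DPI values from your images is outside its scope.

```javascript
const MIN_DPI = 300; // Minimum resolution recommended above.

// Hypothetical pre-flight check: returns a list of problems (empty if none).
function checkScanResolution(templateDpi, imageDpi) {
    const problems = [];
    if (templateDpi < MIN_DPI) {
        problems.push("Template resolution is below " + MIN_DPI + " DPI.");
    }
    if (imageDpi < MIN_DPI) {
        problems.push("Image resolution is below " + MIN_DPI + " DPI.");
    }
    if (templateDpi !== imageDpi) {
        problems.push("Template and image resolutions differ.");
    }
    return problems;
}

// A 300 DPI template with a 200 DPI scan reports two problems.
console.log(checkScanResolution(300, 200));
```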

Streamline Forms with Automation 

Forms are part of everyday business activities. Whether you work in insurance or healthcare, retail or staffing, forms are necessary to get the job done. One of the biggest struggles of forms is their manual nature. Software tries to streamline the collection of data, but often you’re still left scanning information and manually storing files on a drive for later use. 

However, digital forms aren’t exclusively available for the companies making the big bucks. Forms automation software shouldn’t be expensive. Docubee helps small companies and big businesses alike streamline their workflow processes with an economical, eco-friendly, and efficient solution.

The Economical Advantage

Digital forms processing helps users save money by enabling them to create smarter forms, use less paper, and minimize the time it takes to collect data.  After creating a form, use it as a template. Save time and money by eliminating the manual recreation of forms. 

Docubee helps users stay on top of tasks and track their progress. Eliminate hours spent sifting through spreadsheets and automate progress tracking to prevent bottlenecks in your business process. As a no-code workflow tool, Docubee can map, replicate, and update business processes in minutes, proving to be highly economical.

Companies using Docubee find the transition to form automation an economical one, and it frees employees to focus on more innovative work. In his article “How Automation Can Help Your Small Business,” Forbes Councils Member Chad Otar writes: “The approach to automation, constantly looking for ways to incrementally improve and automate your business, is the slow and steady path to success. It also allows you to track and optimize each new effort without feeling like you have to overhaul your business from head to toe each time.”

The Efficiency Gains

When you create a standard form and map out your business process workflow, you’re creating a more productive work environment for your team. Docubee’s Fast Form Creator helps users create reusable forms, route them to the appropriate party, track progress, and reduce manual paper-based data entry. 

Docubee populates documents with data from web forms and other system databases, making the process efficient and timely. Its mobile-friendly dynamic web forms enable companies to automate their processes, which leads to faster turnaround times and higher form completion rates.

According to TechFunnel, human resources departments “… often fail to engage new candidates and potential employees well enough, while automated software can help not only use data to find better-qualified candidates, but also support collaboration between management and HR. It also helps better monitor and track all recruitment and onboarding activities.”

Digital vs. Physical 

Reducing needless paper waste is a major concern for many companies. Docubee helps eliminate paper waste by storing all data in a digital format. Docubee’s digital signature feature expedites forms processing and lets users track the progress and approval of forms. Digitally sign documents anytime, anywhere, on any device, and stop using fax machines or scanners to process your signature.

According to the Business Guide to Paper Reduction: “There does not need to be a distinction between paper reduction efforts that are good for the environment and good for the bottom line. The two even amplify each other – while cost-savings will be the most tangible benefit, a reputation for being environmentally conscious can also be good for business.” Docubee helps eliminate this waste, which in turn helps companies succeed as environmentally conscious partners.

Economical, efficient, and environmentally friendly, Docubee is a business process automation tool that every growing company can use to build and process forms easily. With its adaptability and ease-of-use, Docubee provides an economical and efficient way for companies big and small to automate their processes.

Question

The PrizmDoc Office Conversion Service on my localhost:18681/admin page says it’s “Starting” or has failed to start. What could the issue be?

Answer

This indicates that the license key provided is not licensed for the Microsoft Office Conversion feature.

To fix this, modify the fidelity.msOfficeDocumentsRenderer property in prizm-services-config.yml.

For information on the supported values, see the Fidelity section in the Central Configuration topic.

OCR segmentation

Today’s high-speed forms processing workflows depend on accurate character recognition to capture data from document images. Rather than manually reviewing forms and entering data by hand, optical character recognition (OCR) and intelligent character recognition (ICR) allow developers to automate the data capture process while also cutting down on human error. Thanks to OCR segmentation, these tools are able to read a wide range of character types to keep forms workflows moving efficiently.

Recognizing Fonts

Deploying OCR to capture data is a complex undertaking due to the immense diversity of fonts in use. Modern character recognition software focuses on identifying the pixel patterns associated with specific characters rather than matching characters to existing libraries. This gives it the flexibility needed to discern multiple font types, but problems can still arise due to spacing issues that make it difficult to tell where one character ends and another begins.

Fonts generally come in one of two forms that impact how much space each character occupies. “Fixed” or “monospaced” fonts are uniformly spaced so that every character takes up the exact same amount of space on the page. While not quite as popular now in the era of word processing software and digital printing, fixed fonts were once the standard form of typeface due to the technical limitations of printing presses and typewriters. On a traditional typewriter, for example, characters were evenly spaced because each typebar (or striker) was a standardized size.

From an OCR standpoint, fixed fonts are easier to read because they can be neatly segmented. Each segmented character is the same size, no matter what letters, numbers, or symbols are used. In the example below, the amount of space occupied by the characters is determined by the number of characters used, not the shape of the characters themselves. This makes it easy to break the text down into a segmented grid for accurate recognition.

OCR segmentation:  Monospace Font Example
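Because every character in a fixed font occupies the same width, a segmentation grid can be computed from the line width and the character count alone. The following is a minimal sketch of that idea, not a description of how any particular OCR engine works; the function name and the pixel values are hypothetical.

```javascript
// Split a line of text of known pixel width into equal-width character cells.
// Each cell is a half-open column range: start is included, end is excluded.
function fixedGridCells(lineWidth, charCount) {
    const cellWidth = lineWidth / charCount;
    const cells = [];
    for (let i = 0; i < charCount; i++) {
        cells.push({
            start: Math.round(i * cellWidth),
            end: Math.round((i + 1) * cellWidth)
        });
    }
    return cells;
}

// A 60-pixel-wide line holding 5 monospaced characters yields 12-pixel cells.
console.log(fixedGridCells(60, 5));
```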

“Proportional” fonts, however, are not uniformly spaced. The amount of space taken up by each character is determined by the shape of the character itself. So while a w takes up the same space as an i in a fixed font, it takes up much more space in a proportional font.

OCR segmentation:  Fixed versus proportional font

The inherent characteristics of proportional fonts make them more difficult to segment cleanly. Since each character occupies a variable amount of space, each segmentation box needs to be a different shape. In the example below, applying a standardized segmentation grid to the text would fail to cleanly separate individual characters, even though both lines feature the exact same character count.

Proportional Font Example
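One common way to handle variable-width characters is to look for empty pixel columns between them (a vertical projection profile) instead of applying a fixed grid. The sketch below illustrates that general technique on a tiny hand-made bitmap; it is an assumption-laden example, not SmartZone's implementation, and the function name and sample data are hypothetical.

```javascript
// rows: equal-length strings where "#" is ink and "." is background.
// Returns half-open column ranges, one per run of columns containing ink.
function segmentByColumnGaps(rows) {
    const width = rows[0].length;
    const segments = [];
    let start = -1; // -1 means we are currently in a gap.

    for (let x = 0; x < width; x++) {
        const hasInk = rows.some(function (row) { return row[x] === "#"; });
        if (hasInk && start === -1) {
            start = x; // A new character begins here.
        } else if (!hasInk && start !== -1) {
            segments.push({ start: start, end: x }); // Character ended at the gap.
            start = -1;
        }
    }
    if (start !== -1) {
        segments.push({ start: start, end: width });
    }
    return segments;
}

// A wide glyph (5 columns) followed by a narrow glyph (1 column).
const glyphRows = [
    "#...#.#",
    "#.#.#.#",
    ".#.#..#"
];

// Two segments of different widths: columns 0-4 and column 6.
console.log(segmentByColumnGaps(glyphRows));
```

Note that this simple approach depends on at least one empty column between characters, so characters that touch or overlap would be merged into a single segment.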

Yet another font challenge comes from “kerning,” which reduces the space between certain characters to allow them to overlap. Frequently used in printing, kerning makes for an aesthetically pleasing font, but it can create serious headaches for OCR data capture because many characters don’t separate cleanly. In the example below, small portions of the W and the A overlap, which could create confusion for an OCR engine as it analyzes pixel data. While the overlap is very slight in this example, many fonts feature far more extreme kerning.

Example of Kerning

In order to get a clean reading of printed text for more accurate recognition results, OCR engines like the one built into Accusoft’s SmartZone SDK utilize segmentation to take an image and split it into several smaller images before applying recognition. This allows the engine to isolate characters from one another to get a clean reading without any stray pixels that could impact recognition results.

Much of this process is handled automatically by the software. SmartZone, for instance, has OCR segmentation settings and properties that are handled internally based on the image at hand. In some cases, however, those controls may need to be adjusted manually to ensure the highest level of accuracy. If a specific font routinely returns failed or low confidence recognition results, it may be necessary to use the OCR segmentation properties to adjust for font characteristics like spaces, overlaps (kerning), and blob size (which distinguishes which pixels are classified as noise).
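To make the idea of a blob size threshold concrete, the sketch below removes connected groups of ink pixels that are smaller than a minimum size. It illustrates the general concept only; the function name, the parameter, and the 4-connected flood fill are assumptions and do not represent SmartZone's properties or internals.

```javascript
// rows: equal-length strings where "#" is ink and "." is background.
// Erases every 4-connected blob containing fewer than minBlobSize pixels.
function removeSmallBlobs(rows, minBlobSize) {
    const height = rows.length;
    const width = rows[0].length;
    const grid = rows.map(function (row) { return row.split(""); });
    const visited = rows.map(function () { return new Array(width).fill(false); });

    for (let y = 0; y < height; y++) {
        for (let x = 0; x < width; x++) {
            if (grid[y][x] !== "#" || visited[y][x]) {
                continue;
            }
            // Flood fill to collect every pixel of this blob.
            const blob = [];
            const stack = [[x, y]];
            visited[y][x] = true;
            while (stack.length > 0) {
                const [cx, cy] = stack.pop();
                blob.push([cx, cy]);
                [[1, 0], [-1, 0], [0, 1], [0, -1]].forEach(function ([dx, dy]) {
                    const nx = cx + dx;
                    const ny = cy + dy;
                    if (nx >= 0 && nx < width && ny >= 0 && ny < height &&
                        grid[ny][nx] === "#" && !visited[ny][nx]) {
                        visited[ny][nx] = true;
                        stack.push([nx, ny]);
                    }
                });
            }
            // Blobs below the threshold are treated as noise and erased.
            if (blob.length < minBlobSize) {
                blob.forEach(function ([bx, by]) { grid[by][bx] = "."; });
            }
        }
    }
    return grid.map(function (cells) { return cells.join(""); });
}

// The 4-pixel blob survives; the isolated pixel at the top right is erased.
const noisyRows = [
    "##....#",
    "##.....",
    "......."
];
console.log(removeSmallBlobs(noisyRows, 2));
```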

Applying ICR Segmentation

All of the challenges associated with cleanly segmenting printed text are magnified when it comes to hand printed text. Characters are rarely spaced or even shaped consistently, especially when they’re drawn without the guidance of comb lines that provide clear separation for the person completing a form.

Since ICR engines read characters as individual glyphs, they can become confused if overlapping characters are interpreted as a single glyph. In the example below, there is a slight overlap between the A and the c, while the cross elements of the f and t are merged to form the impression of a single character.

ICR Segmentation Properties

SmartZone’s ICR segmentation properties can be used to pull apart overlapping characters and split merged characters for more accurate recognition results. This is also important for maintaining a consistent character count. If the ICR engine isn’t accounting for overlapped and merged characters, it could return fewer character results than are actually present in the image.

Enhance Your Data Forms Capture with SmartZone

Accusoft’s SmartZone SDK supports both zonal and full page OCR/ICR for forms processing workflows to quickly and accurately capture information from document images. When incorporated into a forms workflow and integrated with identification and alignment tools like the ones found in FormSuite, users can streamline data capture and processing by extracting text and routing it to the appropriate databases or application tools. SmartZone’s OCR supports 77 distinct languages from around the world, including a variety of Asian and Cyrillic characters. For a hands-on look at how SmartZone can enhance your data capture workflow, download a free trial today.

Question

Can I remove your branding without the additional advanced features such as eSignature and annotation?

Answer

Yes, please contact us for more information on this request.