Mozilla is once again exploring how to extract meaningful information from web pages, with the aim of building a more intelligent browser that can respond quickly to what users need. To achieve this goal, Mozilla is drawing on its Fathom JavaScript framework.
Mozilla Firefox, or simply Firefox, is a free and open-source web browser developed by the Mozilla Foundation and its subsidiary, the Mozilla Corporation. Firefox runs on Windows, macOS, and Linux, and on Android for mobile phones.
Fathom is a “mini language” for writing semantic extractors. It is already in production with Firefox’s Activity Stream web traffic tracker, picking out page descriptions, images, and other items, said Mozilla’s Erik Rose. Fathom is still at an early stage of development, but it ”enables Firefox to understand the structure and possibly the contents of a web page,” he said. The framework can be used in browsers, browser extensions, and server-side software.
The Applauded Features of Mozilla Firefox
Several scenarios were presented in which Firefox could understand pages much as a person would. For example, the browser could recognize and follow a log-in link. It could also provide hotkeys to dismiss popovers, hide superfluous navigation or header sections on small screens, and determine what is printable without the need for print stylesheets.
All of these scenarios depend on the browser being able to identify the meaningful parts of a page. Echoing the much-touted semantic web, Rose referred to previous attempts in this vein, such as the Resource Description Framework, semantic tags, and microformats.
Attributes of Fathom:
Among Fathom's features, the most emphasis was placed on the fact that it is a data-flow language, like Prolog. It can extract meaning from web pages by identifying essential parts such as address forms, Previous and Next buttons, and the main body content of the page. DOM nodes are selected based on user-specified conditions and scored with a system of types and annotations, which expresses dependencies between the scoring steps and keeps state under control. Existing sets of scoring rules can be extended without editing them directly, so third-party refinements can coexist with the originals.
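The scoring idea described above can be sketched roughly as follows. This is not Fathom's API (Fathom is a JavaScript framework) but a minimal Python illustration under stated assumptions: the names `Node`, `Rule`, `score`, and `best` are hypothetical, pages are modeled as flat lists of nodes, and scores are assumed to combine by multiplication.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Node:
    """Hypothetical stand-in for a DOM node."""
    tag: str
    text: str = ""
    attrs: Dict[str, str] = field(default_factory=dict)


@dataclass(frozen=True)
class Rule:
    """A condition on a node plus a score multiplier for one output type."""
    out_type: str
    condition: Callable[[Node], bool]
    multiplier: float


def score(nodes: List[Node], rules: List[Rule], wanted_type: str) -> List[float]:
    """Start every node at 1.0 and multiply in each matching rule (assumed scheme)."""
    scores = [1.0] * len(nodes)
    for rule in rules:
        if rule.out_type != wanted_type:
            continue
        for i, node in enumerate(nodes):
            if rule.condition(node):
                scores[i] *= rule.multiplier
    return scores


def best(nodes: List[Node], rules: List[Rule], wanted_type: str) -> Node:
    """Return the highest-scoring node for the requested type."""
    scores = score(nodes, rules, wanted_type)
    return nodes[max(range(len(nodes)), key=scores.__getitem__)]


# A base rule set, plus a third-party refinement kept in a separate list.
BASE_RULES = [
    Rule("next_button", lambda n: n.tag in ("a", "button"), 2.0),
    Rule("next_button", lambda n: "next" in n.text.lower(), 3.0),
]
EXTRA_RULES = [
    Rule("next_button", lambda n: n.attrs.get("rel") == "next", 4.0),
]

page = [
    Node("p", "Read the next chapter soon"),
    Node("a", "Previous", {"rel": "prev"}),
    Node("a", "Next", {"rel": "next"}),
]

# Extending the rule set is list concatenation; BASE_RULES is never edited.
winner = best(page, BASE_RULES + EXTRA_RULES, "next_button")
print(winner.text)
```

Because the refinement lives in its own list, the base rules stay untouched, which is the kind of coexistence the paragraph above describes.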
Fathom’s rule sets are data. They look like JavaScript function calls, but the calls only make annotations in a version of a syntax tree. “Today, that gets us automatic tuning of score constants,” Rose said. “Tomorrow, it could get us automatic generation of rules themselves.”
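To see why treating rules as data makes tuning possible, consider the rough sketch below. The source does not say how Fathom tunes its score constants, so this example substitutes a plain exhaustive grid search. Everything in it is hypothetical: the names `RULESET`, `FEATURES`, `pick`, `accuracy`, and `tune`, and the two tiny hand-labeled sample pages.

```python
from itertools import product

# The rule set is plain data: (feature name, default coefficient).
RULESET = [("is_link", 2.0), ("says_next", 2.0)]

# Feature tests are looked up by name, so the rule set stays inspectable.
FEATURES = {
    "is_link": lambda node: node["tag"] == "a",
    "says_next": lambda node: "next" in node["text"].lower(),
}


def pick(nodes, coeffs):
    """Index of the best-scoring node under the given coefficients."""
    def node_score(node):
        total = 1.0
        for (name, _default), coeff in zip(RULESET, coeffs):
            if FEATURES[name](node):
                total *= coeff
        return total
    return max(range(len(nodes)), key=lambda i: node_score(nodes[i]))


def accuracy(samples, coeffs):
    """Fraction of labeled pages where the picked node matches the label."""
    hits = sum(pick(nodes, coeffs) == answer for nodes, answer in samples)
    return hits / len(samples)


def tune(samples, candidates=(0.5, 1.0, 2.0, 4.0)):
    """Grid search: try every coefficient combination and keep the most accurate."""
    grid = product(candidates, repeat=len(RULESET))
    return max(grid, key=lambda coeffs: accuracy(samples, coeffs))


# Hand-labeled pages: (nodes, index of the true "next" control).
SAMPLES = [
    ([{"tag": "p", "text": "next week"}, {"tag": "a", "text": "Next"}], 1),
    ([{"tag": "a", "text": "Home"}, {"tag": "span", "text": "Next page"}], 1),
]

defaults = tuple(coeff for _name, coeff in RULESET)
print(accuracy(SAMPLES, defaults), tune(SAMPLES))
```

Since the rules are data rather than opaque code, the tuner can swap coefficients in and out without touching the feature tests. A real tuner would need far more labeled pages and a smarter search than this grid.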