Package: repboxDoc 0.1.0

repboxDoc: Heuristically prepare PDF or HTML articles or appendices for further repbox analysis

Authors: Sebastian Kranz

repboxDoc_0.1.0.tar.gz
repboxDoc_0.1.0.zip (r-4.7), repboxDoc_0.1.0.zip (r-4.6), repboxDoc_0.1.0.zip (r-4.5)
repboxDoc_0.1.0.tgz (r-4.6-any), repboxDoc_0.1.0.tgz (r-4.5-any)
repboxDoc_0.1.0.tar.gz (r-4.7-any), repboxDoc_0.1.0.tar.gz (r-4.6-any)
repboxDoc_0.1.0.tgz (r-4.6-emscripten)
manual.pdf | manual.html
card.svg | card.png
repboxDoc/json (API)

# Install 'repboxDoc' in R:
install.packages('repboxDoc', repos = c('https://skranz.r-universe.dev', 'https://cloud.r-project.org'))

Bug tracker: https://github.com/repboxr/repboxdoc/issues

2.88 score, 1 package, 8 scripts, 147 exports, 109 dependencies

Last updated from: e0f14b8502 (on main). Checks: 7 WARNING, 2 OK. Indexed: no.

Target                 Result    Time
linux-devel-x86_64     WARNING   163
source / vignettes     OK        206
linux-release-x86_64   WARNING   141
macos-release-arm64    WARNING   98
macos-oldrel-arm64     WARNING   78
windows-devel          WARNING   101
windows-release        WARNING   81
windows-oldrel         WARNING   127
wasm-release           OK        135

Exports: bind_rows_with_parent_fields, cell_df_join, cell_df_to_tabhtml, change_file_ext, check_and_repair_footnote_candidates, combine_short_paragraphs, combine_text_lines, doc_dir_to_artid, doc_dir_to_project_dir, ecta_parse_html, ecta_parse_html_table, ends.with.text, ensure_empty_types, example, example_mocr_make, extract_all_to_index_df, extract_num_from_sequence_text, extract_order_num_from_sequence_text, find_wrong_mocr, first_repair_rdoc_pdf_text, first.non.null, from_to, get_phrases_def, guess_journ_from_artid, html_tab_cell_row_panel_df, html_table_cells_from_all_tr, html_table_cells_from_tr, html_text_part_df_standardize, identify_figure_lines_on_page, is_aer_pandp, is_really_a_note_line, is.true, jpe_parse_html, jpe_parse_html_table, keep.overlapping.loc, left_join_overlap, line_df_find_figures, line_df_find_footnotes, line_df_find_junk_lines, line_df_find_page_header_footer, line_df_find_section_cands, line_df_find_sections, line_df_to_part_df, lines_to_pages, lines_to_plines, load_phrases_def, loc_sep_lines, loc_to_df, locate_all_as_df, locate_col_refs_in_txt, locate_sentences_in_txt, locate_tab_fig_refs_in_txt, make_phrases_def, map_loc_to_parent_loc, match_overlap, mocr_copy_to_ejs, mocr_html_extract_tables, mocr_make_ocr, mocr_md_to_html_by_page, mocr_md_to_html_mono, mocr_parse_html_parts, most.common, ms_parse_html, ms_parse_html_table, my_pandoc, my_rank, na.false, na.remove, na.val, pdf_to_txt_pages, plines_to_lines, rdoc_document_url, rdoc_find_in_text_fixed, rdoc_form, rdoc_guess_journ, rdoc_has_art_mocr, rdoc_has_html, rdoc_has_pdf, rdoc_has_two_col, rdoc_html_file, rdoc_html_process, rdoc_html_tab_standardize, rdoc_html_to_parts, rdoc_is_processed, rdoc_load_art_meta_data, rdoc_load_page_df, rdoc_load_part_df, rdoc_load_ref_li, rdoc_load_sent_df, rdoc_load_tab_df, rdoc_load_with_cache, rdoc_mocr_process, rdoc_options, rdoc_opts, rdoc_pdf_extract_raw_tabs, rdoc_pdf_extract_tabs, rdoc_pdf_file, rdoc_pdf_pages_to_parts, rdoc_pdf_process, rdoc_pdf_to_txt_pages, rdoc_phrase_analysis, rdoc_process, rdoc_refs_analysis, rdoc_repair_two_col, rdoc_repair_two_col_aer_pandp, rdoc_sent_df, rdoc_steps_from, rdoc_tab_fig_refs, rdoc_tab_phrase_analysis, rdoc_tab_ref_text, rdoc_tabs_file, rdoc_text_parts_phrase_analysis, rdoc_type, rdoc_update_project, readRDS.or.null, refine_cell_df_and_add_panel_info, remove_nested_html_elements, remove.cols, remove.overlapping.loc, repair_ejd_files_art_mocr, repbox_all_pdf_file, repbox_doc_dirs, repbox_doc_file_select, repbox_doc_files_info, repbox_doc_types, repbox_journ_list, repbox_pdf_file, repbox_process_all_docs, repbox_rdoc_opts, restat_parse_html, restud_parse_html, restud_parse_html_table, rle_block, rle_cummax_block, rle_table, save_rds_create_dir, sentences_merge_with_next, seq_rows, show_cell_df_html, substitute_wrong_pdf_txt_chars, text_df_add_section_cols, text_parts_tab_fig_references, text_parts_to_loc, txt_locate_keywords, txt_locate_rx_keywords, txt_locate_typed_keywords, txt_phrase_analysis

Dependencies: askpass, base64enc, brew, brio, bslib, cachem, callr, cli, clipr, commonmark, cpp11, crayon, credentials, curl, data.table, desc, devtools, diffobj, digest, downlit, dplyr, ellipsis, evaluate, ExtractSciTab, fansi, fastmap, fontawesome, fs, generics, gert, gh, gitcreds, glue, highr, htmltools, htmlwidgets, httpuv, httr, httr2, ini, jquerylib, jsonlite, knitr, later, lifecycle, magrittr, memoise, mime, miniUI, openssl, otel, pak, pillar, pkgbuild, pkgconfig, pkgdown, pkgload, praise, prettyunits, processx, profvis, promises, ps, purrr, R6, ragg, rappdirs, rcmdcheck, Rcpp, repboxDB, repboxTableTools, repboxUtils, restorepoint, rlang, rmarkdown, roxygen2, rprojroot, rstudioapi, rversions, rvest, sass, selectr, sessioninfo, shiny, sourcetools, stringi, stringr, stringtools, sys, systemfonts, testthat, textshaping, tibble, tidyr, tidyselect, tinytex, urlchecker, usethis, utf8, vctrs, waldo, whisker, withr, xfun, xml2, xopen, xtable, yaml, zip