\StaticSiteRewriteLinksTask
Rewrites content-links found in <img> "src" and <a> "href" HTML tag-attributes, which were originally imported via {@link StaticSiteImporter}.
The task takes three arguments:
- ImportID
Allows the rewriter to know which content to rewrite, when duplicate imports exist.
- SourceID
Allows the rewriter to fetch the correct content relative to the given source of scraped URLs.
- SilentRun
Allows the rewriter to run silently, without writing anything to stdout. This is the
default when the task is run from the front-end.
All rewrite failures are written to an individual DataObject to power the
CMS report.
- Author: Sam Minnee <sam@silverstripe.com>
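As an illustration of how the three arguments fit together, the sketch below assembles a request URL for the task. It assumes the usual SilverStripe `dev/tasks/<ClassName>` route for a BuildTask; the helper name `buildRewriteTaskUrl()`, the example IDs, and the value `1` for SilentRun are hypothetical and not part of this class.

```php
<?php
// Hypothetical helper (not part of StaticSiteRewriteLinksTask): builds a URL
// that passes SourceID, ImportID and, optionally, SilentRun to the task.
function buildRewriteTaskUrl(int $sourceId, int $importId, bool $silent = false): string
{
    $params = [
        'SourceID' => $sourceId,
        'ImportID' => $importId,
    ];
    if ($silent) {
        // Assumption: any truthy value switches on the silent mode.
        $params['SilentRun'] = 1;
    }
    // Force "&" as the separator so the result does not depend on php.ini.
    return 'dev/tasks/StaticSiteRewriteLinksTask?' . http_build_query($params, '', '&');
}

echo buildRewriteTaskUrl(1, 2), PHP_EOL;
```

The same key/value pairs could equally be passed on the command line; only the parameter names are taken from the description above.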
Synopsis
class StaticSiteRewriteLinksTask
extends BuildTask
{
- // members
- public static array $non_http_uri_schemes = ;
- public static string $summary_prefix = 'Import No.';
- protected string $description = 'Rewrites imported links into SilverStripe compatible format.';
- public number $currentPageId = NULL;
- public array $listFailedRewrites = ;
- protected $contentSourceID;
- protected $contentImportID;
- protected $silentRun;
- protected StaticSiteContentSource $contentSource = NULL;
- protected string $newLine = '';
- // methods
- public null run()
- public void printMessage()
- public void writeFailedRewrites()
- public array countFailureTypes()
- public boolean linkIsThirdParty()
- public boolean linkIsBadScheme()
- public boolean linkIsNotImported()
- public boolean linkIsAlreadyRewritten()
- public boolean linkIsJunk()
- public string badLinkType()
- public boolean ignoreUrl()
- public void setContentSourceID()
- public boolean checkInputs()
- public void printTaskInfo()
- protected void pushFailedRewrite()
Hierarchy
Extends
- BuildTask
Tasks
Line | Task |
---|---|
308+ | What to do with report summaries when a task for the same import is re-run? |
360+ | too many URLs being collected in $this->listFailedRewrites |
457+ | can we add a check for links with anchors to other pages? |
Members
protected
- $contentImportID — int. The import identifier.
- $contentSource — StaticSiteContentSource. The StaticSiteContentSource which has the links to be rewritten.
- $contentSourceID — int. The ID number of the StaticSiteContentSource which has the links to be rewritten.
- $description — string
- $newLine — string
- $silentRun
public
- $currentPageId — number
- $listFailedRewrites — array. Stores the dodgy URLs for later analysis.
- $non_http_uri_schemes — array. A non-exhaustive list of non-http(s) URI schemes which we don't want to try to normalise.
- $summary_prefix — string
Methods
protected
- pushFailedRewrite() — Build an array of failed URL rewrites for later reporting.
public
- badLinkType() — What kind of bad link is $link? The returned string should match the ENUM values on FailedURLRewriteObject
- checkInputs() — Checks that the user-passed data is valid.
- countFailureTypes() — Returns an array of totals of all the failed URLs, broken down into four categories: non-$baseURL http(s) URLs; non-http(s) URI schemes (e.g. mailto, tel); URLs not imported; and junk URLs (i.e. those not matching any of the above).
- ignoreUrl() — Whether or not to ignore a URL. Returns true if a URL is either:
- linkIsAlreadyRewritten() — Detects if a link has already been re-written.
- linkIsBadScheme() — Detects if a link uses an unsupported protocol (e.g. mailto, tel etc)
- linkIsJunk() — Detects if a link begins with a non-legitimate character.
- linkIsNotImported() — Detects if, after the rewrite task is run, a link doesn't match a valid CMS link shortcode.
- linkIsThirdParty() — Detects if a link is to a third-party website.
- printMessage() — Prints notices and warnings and aggregates them into two lists for later analysis, depending on $level and whether you're using the CLI or a browser to run the task.
- printTaskInfo() — Prints information on the options available for running the task and debugging and usage examples.
- run() — Starts the task
- setContentSourceID() — Set the ID number of the StaticSiteContentSource
- writeFailedRewrites() — Write failed rewrites to the {@link BadImportLog} for later analysis by users via the CMS' Report admin.
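The four failure categories described for countFailureTypes() and badLinkType() can be sketched as a stand-alone classifier. This is an illustration only and not the class's actual implementation: the function names `classifyBadLink()` and `countBadLinkTypes()`, the category strings, and the rule used to decide what counts as "junk" are all assumptions.

```php
<?php
// Hypothetical classifier: the category names are illustrative and may not
// match the real ENUM values on FailedURLRewriteObject.
function classifyBadLink(string $link, string $baseHost): string
{
    $link = trim($link);
    // Does the link start with a URI scheme such as "http:" or "mailto:"?
    if (preg_match('/^([a-z][a-z0-9+.\-]*):/i', $link, $m)) {
        $scheme = strtolower($m[1]);
        if ($scheme !== 'http' && $scheme !== 'https') {
            return 'BadScheme'; // e.g. mailto, tel
        }
        $host = parse_url($link, PHP_URL_HOST);
        $sameHost = is_string($host) && strcasecmp($host, $baseHost) === 0;
        // An http(s) link on our own host that is still failing was not imported.
        return $sameHost ? 'NotImported' : 'ThirdParty';
    }
    // Assumption: relative links start with a letter, digit, "/", "#" or ".".
    if (preg_match('/^[a-z0-9\/#.]/i', $link)) {
        return 'NotImported';
    }
    return 'Junk'; // anything not matching the cases above
}

// Tallies a list of failed links per category.
function countBadLinkTypes(array $links, string $baseHost): array
{
    $totals = ['ThirdParty' => 0, 'BadScheme' => 0, 'NotImported' => 0, 'Junk' => 0];
    foreach ($links as $link) {
        $totals[classifyBadLink($link, $baseHost)]++;
    }
    return $totals;
}

$failed = [
    'http://other.example.org/a',
    'mailto:sam@example.com',
    '/about-us/',
    '%20broken',
    'https://www.example.com/page',
];
print_r(countBadLinkTypes($failed, 'www.example.com'));
```

Checking the scheme first keeps the categories mutually exclusive, which matches the description of junk URLs as "those not matching any of the above".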