Large language models are trained on all kinds of data, much of it gathered without anyone's knowledge or consent. You now have the option to decide whether Google may use your web content as training data for its Bard AI and any future models it develops.
All it takes is disallowing the "Google-Extended" user agent in your website's robots.txt, the file that tells automated web crawlers which material they may access (a sketch follows below).
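A minimal robots.txt along those lines might look like the following. "Google-Extended" is the token Google documents for this purpose; the surrounding directives are just one reasonable way to combine it with an ordinary allow-all rule:

```
# Opt this site's content out of Bard / Vertex AI training.
# Google-Extended is a control token honored by Google's existing
# crawlers; blocking it does not affect Google Search indexing.
User-agent: Google-Extended
Disallow: /

# All other crawlers remain unrestricted.
User-agent: *
Disallow:
```

The file lives at the root of your domain (for example, example.com/robots.txt), and per Google's announcement, updating it is all that's required; there is no registration or account setting involved.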
AI training is a meaningfully different use case from web indexing, whatever Google's assertions that it develops its AI ethically and inclusively.
In an unexpected turn, Danielle Romain, the company’s Vice President of Trust, states in a blog post, “Web publishers have expressed a desire for increased options and influence regarding the utilization of their content in emerging generative AI applications.”
The word "train" is conspicuously absent from the post, even though this data will evidently serve as the raw material for building machine learning models.
Instead, the VP of Trust asks you if you genuinely don’t want to “help improve Bard and Vertex AI generative APIs” — “to help these AI models become more accurate and capable over time.” It’s not like Google is stealing from you, you see. Your willingness to assist is what matters.
On the one hand, this is the right way to frame the question: consent is the crucial factor here, and Google should indeed be asking for a positive contribution. On the other hand, the framing rings hollow, because Bard and Google's other models have already been trained on vast amounts of data collected from individuals without their consent.
The unavoidable truth revealed by Google's actions is that the company exploited unrestricted access to the web's data, took what it needed, and is now asking for consent after the fact so that it can appear to care about consent and ethical data acquisition. If it did, we would have had this option years ago.
Relatedly, Medium recently announced that it will block crawlers like these across the board until a better, more granular solution emerges. And it is by no means alone.
[Source: TechCrunch.com]
I am Aleena Parvez, a Content Editor and Proofreader and a proficient digital marketer, skilled in devising and executing strategies that amplify brand exposure and drive conversions. I am experienced in SEO, SEM, and social media, with a sharp ability to analyze data and optimize campaigns for the best results, and I am passionate about using technology and creativity to achieve tangible outcomes and exceed client expectations.